Abhimanyu Pallavi Sudhir
http://www.rssmix.com/
This feed was created by mixing existing feeds from various sources. (RSSMix)

Comment by Abhimanyu Pallavi Sudhir on Because things smell, is everything evaporating?
https://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating
I guess the essence of the question is: why are spontaneous reactions that produce gaseous products so common? This probably has to do with the high entropy of gases or something.
Thu, 21 Nov 2019 16:23:39 GMT | https://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating?cid=1161380 | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on Because things smell, is everything evaporating?
https://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating
The answer to the metal question is here: <a href="https://chemistry.stackexchange.com/questions/7916/why-can-we-smell-copper">Why can we smell copper?</a> and <a href="https://www.livescience.com/4233-coins-smell.html" rel="nofollow noreferrer">here</a>. I guess the standard haemoglobin explanation of the metallic smell of blood is false: <a href="https://www.quora.com/Why-does-blood-smell-like-copper/answer/Song-Chencheng" rel="nofollow noreferrer">Why does blood smell like copper?</a>
Thu, 21 Nov 2019 16:20:04 GMT | https://physics.stackexchange.com/questions/515304/because-things-smell-is-everything-evaporating?cid=1161377 | Abhimanyu Pallavi Sudhir

Answer by Abhimanyu Pallavi Sudhir for the vector space of Magic Squares
https://math.stackexchange.com/questions/1692624/the-vector-space-of-magic-squares/3445000#3445000
<p>Here's an easy way to do this for general <span class="math-container">$n$</span>: given a magic number <span class="math-container">$S$</span>, consider the top-left <span class="math-container">$(n-1)$</span> by <span class="math-container">$(n-1)$</span> submatrix of the square. Given these values, one can fill in the margins by subtracting rows and columns of the submatrix from <span class="math-container">$S$</span>, and the bottom-right entry by subtracting the diagonal of the submatrix from <span class="math-container">$S$</span>.</p>
<p>The only equations remaining to satisfy are: (1) the sum of each new margin equals <span class="math-container">$S$</span> and (2) the sum of the non-principal diagonal equals <span class="math-container">$S$</span>. Condition (1) is the same for each margin (because the last column can be determined given all the rows and all the other columns). So the conditions are, where the indices range over the submatrix, <span class="math-container">$1\le i,j\le n-1$</span> and <span class="math-container">$1\lt k\lt n$</span>:</p>
<p><span class="math-container">$$\sum_{i}a_{ii}=(n-1)S-\sum_{ij}a_{ij}$$</span></p>
<p><span class="math-container">$$\sum_k a_{k(n-k+1)}=\sum_j a_{1j}+\sum_i a_{i1}-S$$</span></p>
<p>These can be checked to be linearly independent for <span class="math-container">$n>2$</span>. Allowing <span class="math-container">$S$</span> to be free, the dimension of our space is therefore <span class="math-container">$(n-1)^2-2+1$</span>, which equals: </p>
<p><span class="math-container">$$n^2-2n$$</span></p>
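As a sanity check (a sketch I'm adding, not part of the original answer; the helper name <code>magic_square_dim</code> is made up), the dimension can be verified numerically as the nullity of the full constraint system:

```python
import numpy as np

def magic_square_dim(n):
    """Dimension of the space of n-by-n matrices whose rows, columns
    and both diagonals all sum to a free magic number S."""
    # Variables: the n*n entries, plus S itself.
    constraints = []
    def constraint(mask):
        c = np.zeros(n * n + 1)
        c[:-1] = mask.ravel()
        c[-1] = -1.0          # (sum of masked entries) - S = 0
        return c
    for i in range(n):        # row sums
        m = np.zeros((n, n)); m[i, :] = 1
        constraints.append(constraint(m))
    for j in range(n):        # column sums
        m = np.zeros((n, n)); m[:, j] = 1
        constraints.append(constraint(m))
    constraints.append(constraint(np.eye(n)))             # main diagonal
    constraints.append(constraint(np.fliplr(np.eye(n))))  # anti-diagonal
    A = np.array(constraints)
    # S is determined by the entries (it equals any row sum), so the
    # kernel of A projects isomorphically onto the space of magic squares.
    return n * n + 1 - np.linalg.matrix_rank(A)

print([magic_square_dim(n) for n in range(3, 7)])  # matches n^2 - 2n
```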
<p>This indeed gives <span class="math-container">$3$</span> in the case <span class="math-container">$n=3$</span>. Meanwhile, for <span class="math-container">$n=1$</span> and <span class="math-container">$n=2$</span>, the dimension is clearly 1.</p>
Thu, 21 Nov 2019 12:04:55 GMT | https://math.stackexchange.com/questions/1692624/-/3445000#3445000 | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on Can't distinguish non-orthogonal state vectors
https://physics.stackexchange.com/questions/514375/cant-distinguish-non-orthogonal-state-vectors
Consider the preimage of a vector under a Hermitian projection operator.
Sun, 17 Nov 2019 12:17:56 GMT | https://physics.stackexchange.com/questions/514375/cant-distinguish-non-orthogonal-state-vectors?cid=1159268 | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on Why are objects at rest in motion through spacetime at the speed of light?
https://physics.stackexchange.com/questions/33840/why-are-objects-at-rest-in-motion-through-spacetime-at-the-speed-of-light/410575#410575
I did say that it's a convention about normalisation ("yes, you can choose other parameterisations"). I'm saying it's not an arbitrary convention, in the sense that it's perfectly sensible to ask that $\tau$ becomes $t$ for a stationary object.
Tue, 12 Nov 2019 11:06:21 GMT | https://physics.stackexchange.com/questions/33840/why-are-objects-at-rest-in-motion-through-spacetime-at-the-speed-of-light/410575?cid=1157162#410575 | Abhimanyu Pallavi Sudhir

Financial derivatives, payoff functions and portfolios: motivation
https://thewindingnumber.blogspot.com/2019/11/financial-derivatives-payoff-functions.html
When I first saw the definitions of several financial assets, I found them completely arbitrary -- it's not that I didn't get the reason one would have them, but rather that I saw no way to immediately understand them, nor a starting point for reasoning about them mathematically. Other than what was perhaps the most basic asset -- stocks (and also bonds, etc., which are really the "same" idea -- I prefer to think of stocks and bonds as being on a continuum based on riskiness) and their baskets -- all the derivatives (and things that aren't called derivatives) based on them seemed really artificial in their construction.<br /><br />But this isn't exactly unfamiliar territory, is it? You've seen unmotivated definitions in mathematics, and you've seen that you need to put in quite a bit of effort to really motivate them and understand why they make perfect sense -- you've seen that, e.g. <a href="https://thewindingnumber.blogspot.com/">here</a>.<br /><br />So let's do the same thing with finance.<br /><br /><hr /><br />Let's start with a simple one: <b>shorting</b>. <br /><br />There is a certain asymmetry in the definitions of longing and shorting, isn't there? It's the "borrowing a stock" part of the definition of shorting that introduces this asymmetry.<br /><br />But if you've spent any time thinking about economics, the idea of borrowing something you don't have should be familiar -- it's what you do when you don't have any investment capital to start with, but you think you can grow the value of what you've borrowed by e.g. investing it in a stock. Let's phrase this in a slightly different (and by "slightly different", I mean "take the buying-selling dual of") way:<br /><br /><b>How to invest in a stock without money at hand:</b> Borrow some money, immediately "sell" the money for some stocks -- after some time has passed, "buy" back the money by returning the stocks. 
If the value of the stocks has increased, you'll get more money in return and be able to repay the loan.<br /><br />This is <i>precisely</i> symmetric to the situation of shorting -- <b>longing an asset</b> just means <i>shorting money</i> -- or more precisely, <b>shorting the rest of the market</b>.<br /><br />The apparent asymmetry between longing and shorting comes from the fact that you are much more likely to already own some of "the rest of the market" than to own a particular stock -- for example, the unbounded losses of shorting arise from the fact that it's much easier for a single stock's value to skyrocket than for money's -- so in longing, there may still be ways for you to earn the money to repay it even if the value of the stock drops, i.e. the value of your other assets (e.g. your labour or property) relative to money would not have dropped.<br /><br />One advantage of this approach is that it is conceptually interesting -- and will hopefully allow us to transfer insights and ideas between stocks and shorts (except when certain approximations may be involved) -- another is that it immediately nullifies "moral" criticism of shorting, from e.g. Elon Musk, as it is really just the same as investing in the "rest of the market".<br /><br /><div class="twn-pitfall">Wait a minute -- but what if you actually just invested in "the rest of the market"? That would clearly have a much lower return than shorting the stock directly, right? Except you're thinking about investing in the rest of the market by paying money, not by paying the stock you're betting against -- that's a bet for the rest of the market against money, not against said stock.</div><br /><hr /><br />Well, shorting was an example where we wanted to bet that the price of an asset goes down. 
But in general, we may have any sort of weird prediction on the price of an asset -- maybe that it will "fluctuate a lot", or that it "won't exceed a certain level", or that it "will go up but only to a point", or that it "will reach a certain range". You may have any sort of elaborate <i>probability distribution</i> $\rho(x)$ on the value $x$ of the asset after a period of time. Given such a distribution, what you'd want to do (<b>ignoring risk</b>) is to maximise your expected return (minus the cost of buying the contract, of course):<br /><br />$$\chi=\int {\rho (x)f(x)dx} $$<br />where $f(x)$ is the payoff you get if the asset reaches the price $x$ -- this is called the <b>payoff function</b>.<br /><br />Well, why not just take $f(x)$ to be arbitrarily high? Because the contract will be really expensive, of course. How expensive? Predicting that would require:<br /><ul><li>not only the $\rho$ distribution on this asset as believed by each seller and buyer in the market</li><li>but also the amount of capital they have and their beliefs about the future behaviour of the other assets in the market, contracts on which they could buy instead</li></ul><div>And that is still not to mention the fact that people do not maximise the expected value of profit per se, but have varying levels of risk aversion.<br /><br />But that's alright -- we don't need to predict that. That price is crunched for us by the market: it is the <b>market price</b> of the contract. What's more important is to estimate $\chi = E_{\rho}[f(x)]$. 
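For instance, here is a minimal sketch (my own illustration, with made-up numbers: the lognormal belief, the current price of 100 and the payoff below are all assumptions, not from the post) of estimating $\chi$ by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed belief rho: the asset's price after the period is lognormal
# around its current (made-up) price of 100.
prices = 100 * np.exp(rng.normal(loc=0.0, scale=0.2, size=100_000))

# An arbitrary illustrative payoff: the contract pays (90 - x) whenever
# the final price x ends below 90, and nothing otherwise.
def payoff(x):
    return np.maximum(90 - x, 0)

# chi = E_rho[f(x)]; compare this against the contract's market price
# (risk preferences aside) to decide whether the contract is worth buying.
chi = payoff(prices).mean()
print(round(chi, 2))
```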
Well, in fact, if we're concerned with <b>risk</b>, then we'd also be interested in the variance of the distribution -- and in general, an individual may also have a skewness or kurtosis preference (an example of a kurtosis preference would be among gamblers, who want heavy tails for the "big win").<br /><br />In fact, $\chi$ can depend on multiple underlying assets:<br /><br />$$\chi=E_\rho[f(\mathbf{x})]$$<br />Where $\mathbf{x}$ is the vector of prices of each underlying asset. In fact, this multivariate $f$ can represent your entire <b>portfolio</b> of derivatives on assets. If $f(\mathbf{x})$ can be written as a sum of functions of each component, this can be considered as some number of separate univariate derivatives -- the reason such a portfolio is still useful is that of risk management, especially if we use a $\rho$ that has some correlations (even otherwise, one may use a portfolio to mitigate risk but correlations allow us to target specific risks).<br /><br /><div class="twn-pitfall">There is an alternative definition of the payoff function, where it is $f(x)$ minus the contract price, i.e. a <b>profit/loss function</b>. The problem with this is that not every function can be a profit/loss function. But it often does make sense.<br /><br />(Think about how one may define a payoff function for shorting (shorting traditionally isn't considered a derivative because it isn't a contract, but I think that's an arbitrary distinction) -- the analog of the "contract price" is then the <i>negative</i> price you "buy" it at (i.e. the negative of the price you initially sell the stock you borrowed), and the negative value that you eventually "get" (i.e. the negative of the price you eventually sell it at) is the payoff function. So the payoff function is $-x$, and is indeed the reflection in the asset value axis of the payoff for a long. 
Check that the profit/loss functions are also reflections, modulo the interest on the stock you borrowed.)</div><br /><hr /><br />It's crucial to get some practice constructing various financial derivatives, i.e. derivatives that have a given payoff function (using the first definition).<br /><br />$$f(x)=(a-x)I(x\lt a)$$<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-3s_6o7haZyo/XcN1Ut7NCDI/AAAAAAAAF1w/QZaoPDdnOjEEs1A8fhGATkNot3K8gC9AQCLcBGAsYHQ/s1600/put.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="896" data-original-width="720" height="320" src="https://1.bp.blogspot.com/-3s_6o7haZyo/XcN1Ut7NCDI/AAAAAAAAF1w/QZaoPDdnOjEEs1A8fhGATkNot3K8gC9AQCLcBGAsYHQ/s320/put.png" width="257" /></a></div>Such a function would be a useful alternative to shorting, as it doesn't allow arbitrary losses.<br /><br />The whole discontinuity of the function really suggests to me a fundamental change in behaviour at the point $x=a$ -- like you just don't make the trade if $x\ge a$. This decision can only be made once the final price is discovered, so you must have bought a contract that gave you the <i>option</i> to make a transaction: that transaction must be <i>selling</i>, it must be executed after the price is realised, but it must be at price $a$, which is initially fixed.<br /><br />This is called a <b>put option</b> -- you buy the <i>option</i> to sell a stock at a pre-decided price. To exercise the option, you instantly buy the stock and sell it at that pre-decided price. Obviously, this price matters -- otherwise, you would be getting a guaranteed nonnegative profit. 
This is really equivalent to insurance.<br /><br />(Verify that the payoff diagram of the seller of the put option is the negative of the one above.)<br /><br /><hr /><br />There's a natural analog of this notion that reduces risks with longing.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-gS71M3H0XoI/XdQCVmDDAQI/AAAAAAAAF24/IMIC43CrO2MF-kctJqIUtXI_BcL1pNMtgCLcBGAsYHQ/s1600/call.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="320" data-original-width="257" src="https://1.bp.blogspot.com/-gS71M3H0XoI/XdQCVmDDAQI/AAAAAAAAF24/IMIC43CrO2MF-kctJqIUtXI_BcL1pNMtgCLcBGAsYHQ/s1600/call.png" /></a></div>Once again, we see that there's a fundamental change of behaviour if the price drops below $x=a$ -- you just don't complete the transaction. So you've bought an <i>option</i> to do something. Well, you need to sell something to make money, but the intercept of the graph suggests that you're also buying the asset, albeit at a fixed price. So this is a <b>call option</b> -- you buy the <i>option</i> to buy a stock at a pre-decided price. To take your profit, you exercise the option, then immediately sell the stock you bought.<br /><br />(Once again, the payoff diagram is a bit misleading and suggests that this is strictly worse than just buying a stock -- remember that the cost of a stock is the entire original stock price, while the cost of the call option is much smaller. These costs are not integrated into the payoff diagrams, but they are in the profit/loss diagrams.)<br /><br />Essentially, call and put options allow you to act on hindsight.<br /><br />One might wonder whether a call option is perhaps not <i>as</i> useful as a put option -- there's not much to insure with longing, right? 
(Compared to shorting, that is.) Perhaps, but there are certain other uses of call options that work together with put options in an interesting way, as we will soon see.</div>
Labels: call options, derivatives, finance, options, portfolio, put options, shorting, stocks
Thu, 07 Nov 2019 02:02:00 GMT | https://thewindingnumber.blogspot.com/2019/11/financial-derivatives-payoff-functions.html | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on Physical interpretation of complex numbers, part 2
https://physics.stackexchange.com/questions/512297/physical-interpretation-of-complex-numbers-part-2
I think you mean to say "if you scale by i, you rotate it 90 degrees". That's correct.
Wed, 06 Nov 2019 13:02:59 GMT | https://physics.stackexchange.com/questions/512297/physical-interpretation-of-complex-numbers-part-2?cid=1154461 | Abhimanyu Pallavi Sudhir

ggplot aes: alpha gets "smoothed out"
https://stackoverflow.com/questions/58524281/ggplot-aes-alpha-gets-smoothed-out
<p>I'm using <code>ggplot</code> from the <code>ggplot2</code> R package, with the <code>mpg</code> data set.</p>
<pre><code>library(ggplot2)  # for ggplot() and the mpg data set
library(dplyr)    # for %>% and mutate()

classify = function(cls){
  if (cls == "suv" || cls == "pickup"){result = 1}
  else {result = 0}
  return(result)
}
mpg = mpg %>% mutate(size = sapply(class, classify))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, alpha = size))
</code></pre>
<p>Now, <code>size</code> can take only two values: 1 when class is <code>suv</code> or <code>pickup</code>, and 0 otherwise. But I get a weird "smooth" range of sizes in the resulting plot:</p>
<p><a href="https://i.stack.imgur.com/Plzzy.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/Plzzy.png" alt="enter image description here"></a></p>
<p>(It's not the legend that surprises me, but the fact that there are actually values plotted with alpha 0.1 or 0.3 or whatever.)</p>
<p>What's going on?</p>
Tags: r, ggplot2, alpha, alpha-transparency
Wed, 23 Oct 2019 13:43:29 GMT | https://stackoverflow.com/q/58524281 | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on If a triangle has rational coordinates, does it have rational area?
https://math.stackexchange.com/questions/490165/if-a-triangle-has-rational-coordinates-does-it-have-rational-area/490170#490170
@RichardRast How is that calculus?
Wed, 23 Oct 2019 11:17:03 GMT | https://math.stackexchange.com/questions/490165/if-a-triangle-has-rational-coordinates-does-it-have-rational-area/490170?cid=7004660#490170 | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on Axiomatic Treatment Of Sum Operators Which Work On Divergent Series
https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series
By the way, why do you suspect that equality to the true sum operator may be too strong?
Mon, 21 Oct 2019 14:01:01 GMT | https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series?cid=6999381 | Abhimanyu Pallavi Sudhir

Comment by Abhimanyu Pallavi Sudhir on Axiomatic Treatment Of Sum Operators Which Work On Divergent Series
https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series/3402796#3402796
@fweth If you're interested in getting started with Lean, there's a nice <a href="https://leanprover.zulipchat.com" rel="nofollow noreferrer">discussion/QA forum</a> you might like.
Mon, 21 Oct 2019 13:55:40 GMT | https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series/3402796?cid=6999365#3402796 | Abhimanyu Pallavi Sudhir

Answer by Abhimanyu Pallavi Sudhir for Axiomatic Treatment Of Sum Operators Which Work On Divergent Series
https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series/3402796#3402796
<p><strong>EDIT:</strong></p>
<p>My original answer actually defined a trivial operator -- the fixed formalisation is due to <a href="https://math.stackexchange.com/users/328173/kenny-lau">Kenny Lau</a> on <a href="https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/Axiomatised.20summations/near/178678162" rel="nofollow noreferrer">Zulip</a> (see the link for discussion regarding non-triviality).</p>
<pre><code>import data.real.basic linear_algebra.basis data.finset data.fintype ring_theory.algebra_operations
open classical
open finset
local attribute [instance, priority 0] prop_decidable
structure is_sum (S : (ℕ → ℝ) → ℝ → Prop) : Prop :=
(wd : ∀ {a s t}, S a s → S a t → s = t)
(sum_add : ∀ {a b s t}, S a s → S b t → S (λ n, a n + b n) (s + t))
(sum_smul : ∀ {a s} c, S a s → S (λ n, c * a n) (c * s))
(sum_shift : ∀ {a s}, S a s → S (λ n, a (n + 1)) (s - a 0))
def has_sum (a : ℕ → ℝ) (s : ℝ) := ∀ S, is_sum S → ∀ t, S a t → t = s
theorem sum_of_has_sum (a : ℕ → ℝ) (s : ℝ) (H : has_sum a s)
(S : (ℕ → ℝ) → ℝ → Prop) (HS : is_sum S) (t : ℝ) (Ht : S a t) :
S a s :=
by rwa (H S HS t Ht).symm
theorem has_sum_alt : has_sum (λ n, (-1) ^ n) (1/2) :=
begin
intros S HS t Ht,
have H3 := HS.sum_shift Ht,
have H2 := HS.sum_smul (-1) Ht,
have H0 := HS.wd H2 H3,
change _ = t - 1 at H0,
linarith,
end
theorem has_sum_alt_id : has_sum (λ n, (-1) ^ n * n) (-1/4) :=
begin
intros S HS t Ht,
have HC : ∀ n : ℕ, (-1 : ℝ) ^ (n + 1) * (n + 1 : ℕ) + (-1) ^ n * n = (-1) * (-1) ^ n :=
λ n, by rw [pow_succ, nat.cast_add, mul_add, nat.cast_one, mul_one, add_comm, ←add_assoc, neg_one_mul,
neg_mul_eq_neg_mul_symm, add_neg_self, zero_add],
have H3 := HS.sum_shift Ht,
have H1 := HS.sum_add H3 Ht,
have H2 := HS.sum_smul (-1) H1,
simp only [nat.cast_zero, mul_zero, sub_zero, HC, neg_one_mul, neg_neg] at H2,
have H4 := has_sum_alt S HS _ H2,
linarith,
end
def fib : ℕ → ℝ
| 0 := 0
| 1 := 1
| (n + 2) := fib n + fib (n + 1)
theorem has_sum_fib : has_sum fib (-1) :=
have HC : ∀ n, fib n + fib (n + 1) = fib (n + 2) := λ n, rfl,
begin
intros S HS t Ht,
have H3 := HS.sum_shift Ht,
have H33 := HS.sum_shift H3,
have H1 := HS.sum_add Ht H3,
have H0 := HS.wd H1 H33, -- can use linearity instead of wd
simp only [fib, sub_zero] at H0,
linarith,
end
-- if a sequence has two has_sums, everything is its sum
-- (this is the case of not being summable, e.g. 1+1+1+...)
theorem has_sum_test_un (a : ℕ → ℝ) (s t : ℝ) (hst : s ≠ t) :
has_sum a s → has_sum a t → ∀ s, has_sum a s :=
λ hs ht u S HS v Hv, false.elim (hst (ht S HS s (sum_of_has_sum a s hs S HS v Hv)))
open submodule
-- a sum operator that is "forced" to give a the sum s
-- a valid sum operator iff the shifts of a are linearly independent
-- in which case a can have any sum, and thus has_sum nothing
def forced_sum (s : ℕ → ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) (S : ℝ) : (ℕ → ℝ) → ℝ → Prop :=
λ t T,
if Ht : t ∈ span ℝ (set.range (λ m n : ℕ, s (n + m))) then
if T = finsupp.sum (linear_independent.repr H ⟨t, Ht⟩) (λ n r, r * (S - finset.sum (@finset.univ (fin n) _) (λ k, s k.val)))
then true
else false
else false
-- linear algebra lemma
lemma spanning_set_subset_span {R M : Type} [ring R] [add_comm_group M] [module R M] {s : set M} :
s ⊆ span R s := span_le.mp (le_refl _)
-- forced_sum does what we want
lemma forced_sum_val (s : ℕ → ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) (S : ℝ) :
forced_sum s H S s S :=
begin
have Hs₁ : s ∈ set.range (λ m n : ℕ, s (n + m)) := set.mem_range.mpr ⟨0, by simp only [add_zero]⟩,
have Hs₂ : s ∈ span ℝ (set.range (λ m n : ℕ, s (n + m))) := spanning_set_subset_span _ Hs₁,
unfold forced_sum,
split_ifs,
end
-- i guess this is the hard part
lemma is_sum_forced_sum (s : ℕ → ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) (S : ℝ) :
is_sum (forced_sum s H S) :=
⟨begin sorry end,  -- proofs left open (see the Zulip discussion)
 begin sorry end,
 begin sorry end,
 begin sorry end⟩
theorem no_sum_of_lin_ind_shifts (s : ℕ → ℝ) (H : linear_independent ℝ (λ m n : ℕ, s (n + m))) :
∀ S : ℝ, ¬ has_sum s S :=
λ S HS,
have X : _ := HS (forced_sum s H (S + 1)) (is_sum_forced_sum s H (S + 1)) (S + 1) (forced_sum_val s H (S + 1)),
by linarith
-- CHALLENGE: formalise the proof here:
-- https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/Axiomatised.20summations/near/178884724
-- REQUIRES GENERATING FUNCTIONS, TAYLOR SERIES -- not currently in Lean!
theorem inv_shifts_lin_ind : linear_independent ℝ (λ m n : ℕ, 1 / (n + m + 1)) :=
begin
  sorry -- open challenge: see the linked Zulip proof
end
</code></pre>
<p>Feel free to <a href="https://leanprover-community.github.io/lean-web-editor/#code=import%20data.real.basic%20linear_algebra.basis%20data.finset%20data.fintype%20ring_theory.algebra_operations%0Aopen%20classical%0Aopen%20finset%20%0A%0Alocal%20attribute%20%5Binstance%2C%20priority%200%5D%20prop_decidable%0A%0Astructure%20is_sum%20%28S%20%3A%20%28%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%E2%86%92%20%E2%84%9D%20%E2%86%92%20Prop%29%20%3A%20Prop%20%3A%3D%0A%28wd%20%3A%20%E2%88%80%20%7Ba%20s%20t%7D%2C%20S%20a%20s%20%E2%86%92%20S%20a%20t%20%E2%86%92%20s%20%3D%20t%29%0A%28sum_add%20%3A%20%E2%88%80%20%7Ba%20b%20s%20t%7D%2C%20S%20a%20s%20%E2%86%92%20S%20b%20t%20%E2%86%92%20S%20%28%CE%BB%20n%2C%20a%20n%20%2B%20b%20n%29%20%28s%20%2B%20t%29%29%0A%28sum_smul%20%3A%20%E2%88%80%20%7Ba%20s%7D%20c%2C%20S%20a%20s%20%E2%86%92%20S%20%28%CE%BB%20n%2C%20c%20*%20a%20n%29%20%28c%20*%20s%29%29%0A%28sum_shift%20%3A%20%E2%88%80%20%7Ba%20s%7D%2C%20S%20a%20s%20%E2%86%92%20S%20%28%CE%BB%20n%2C%20a%20%28n%20%2B%201%29%29%20%28s%20-%20a%200%29%29%0A%0Adef%20has_sum%20%28a%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28s%20%3A%20%E2%84%9D%29%20%3A%3D%20%E2%88%80%20S%2C%20is_sum%20S%20%E2%86%92%20%E2%88%80%20t%2C%20S%20a%20t%20%E2%86%92%20t%20%3D%20s%0A%0Atheorem%20sum_of_has_sum%20%28a%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28s%20%3A%20%E2%84%9D%29%20%28H%20%3A%20has_sum%20a%20s%29%20%0A%20%20%28S%20%3A%20%28%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%E2%86%92%20%E2%84%9D%20%E2%86%92%20Prop%29%20%28HS%20%3A%20is_sum%20S%29%20%28t%20%3A%20%E2%84%9D%29%20%28Ht%20%3A%20S%20a%20t%29%20%3A%20%0A%20%20S%20a%20s%20%3A%3D%20%0Aby%20rwa%20%28H%20S%20HS%20t%20Ht%29.symm%20%0A%0Atheorem%20has_sum_alt%20%3A%20has_sum%20%28%CE%BB%20n%2C%20%28-1%29%20%5E%20n%29%20%281%2F2%29%20%3A%3D%0Abegin%0A%20%20intros%20S%20HS%20t%20Ht%2C%0A%20%20have%20H3%20%3A%3D%20HS.sum_shift%20Ht%2C%0A%20%20have%20H2%20%3A%3D%20HS.sum_smul%20%28-1%29%20Ht%2C%0A%20%20have%20H0%20%3A%3D%20HS.wd%20H2%20H3%2C%0A%20%20change%20_%20
%3D%20t%20-%201%20at%20H0%2C%0A%20%20linarith%2C%0Aend%0A%0Atheorem%20has_sum_alt_id%20%3A%20has_sum%20%28%CE%BB%20n%2C%20%28-1%29%20%5E%20n%20*%20n%29%20%28-1%2F4%29%20%3A%3D%0Abegin%0A%20%20intros%20S%20HS%20t%20Ht%2C%0A%20%20have%20HC%20%3A%20%E2%88%80%20n%20%3A%20%E2%84%95%2C%20%28-1%20%3A%20%E2%84%9D%29%20%5E%20%28n%20%2B%201%29%20*%20%28n%20%2B%201%20%3A%20%E2%84%95%29%20%2B%20%28-1%29%20%5E%20n%20*%20n%20%3D%20%28-1%29%20*%20%28-1%29%20%5E%20n%20%3A%3D%20%0A%20%20%20%20%CE%BB%20n%2C%20by%20rw%20%5Bpow_succ%2C%20nat.cast_add%2C%20mul_add%2C%20nat.cast_one%2C%20mul_one%2C%20add_comm%2C%20%E2%86%90add_assoc%2C%20neg_one_mul%2C%0A%20%20%20%20neg_mul_eq_neg_mul_symm%2C%20add_neg_self%2C%20zero_add%5D%2C%20%0A%20%20have%20H3%20%3A%3D%20HS.sum_shift%20Ht%2C%0A%20%20have%20H1%20%3A%3D%20HS.sum_add%20H3%20Ht%2C%0A%20%20have%20H2%20%3A%3D%20HS.sum_smul%20%28-1%29%20H1%2C%0A%20%20simp%20only%20%5Bnat.cast_zero%2C%20mul_zero%2C%20sub_zero%2C%20HC%2C%20neg_one_mul%2C%20neg_neg%5D%20at%20H2%2C%0A%20%20have%20H4%20%3A%3D%20has_sum_alt%20S%20HS%20_%20H2%2C%0A%20%20linarith%2C%0Aend%0A%0Adef%20fib%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%0A%7C%200%20%3A%3D%200%0A%7C%201%20%3A%3D%201%0A%7C%20%28n%20%2B%202%29%20%3A%3D%20fib%20n%20%2B%20fib%20%28n%20%2B%201%29%0A%0Atheorem%20has_sum_fib%20%3A%20has_sum%20fib%20%28-1%29%20%3A%3D%0Ahave%20HC%20%3A%20%E2%88%80%20n%2C%20fib%20n%20%2B%20fib%20%28n%20%2B%201%29%20%3D%20fib%20%28n%20%2B%202%29%20%3A%3D%20%CE%BB%20n%2C%20rfl%2C%0Abegin%0A%20%20intros%20S%20HS%20t%20Ht%2C%0A%20%20have%20H3%20%3A%3D%20HS.sum_shift%20Ht%2C%0A%20%20have%20H33%20%3A%3D%20HS.sum_shift%20H3%2C%0A%20%20have%20H1%20%3A%3D%20HS.sum_add%20Ht%20H3%2C%0A%20%20have%20H0%20%3A%3D%20HS.wd%20H1%20H33%2C%20--%20can%20use%20linearity%20instead%20of%20wd%0A%20%20simp%20only%20%5Bfib%2C%20sub_zero%5D%20at%20H0%2C%0A%20%20linarith%2C%0Aend%0A%0A--%20if%20a%20sequence%20has%20two%20has_sums%2C%20everything%20is%20its%20sum%20%0A--%20%28this%20is%20the%20case%20of%20not%20be
ing%20summable%2C%20e.g.%201%2B1%2B1%2B...%29%0Atheorem%20has_sum_test_un%20%28a%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28s%20t%20%3A%20%E2%84%9D%29%20%28hst%20%3A%20s%20%E2%89%A0%20t%29%20%3A%20%0A%20%20has_sum%20a%20s%20%E2%86%92%20has_sum%20a%20t%20%E2%86%92%20%E2%88%80%20s%2C%20has_sum%20a%20s%20%3A%3D%0A%CE%BB%20hs%20ht%20u%20S%20HS%20v%20Hv%2C%20false.elim%20%28hst%20%28ht%20S%20HS%20s%20%28sum_of_has_sum%20a%20s%20hs%20S%20HS%20v%20Hv%29%29%29%0A%0Aopen%20submodule%0A%0A--%20a%20sum%20operator%20that%20is%20%22forced%22%20to%20give%20a%20the%20sum%20s%0A--%20a%20valid%20sum%20operator%20iff%20the%20shifts%20of%20a%20are%20linearly%20independent%0A--%20in%20which%20case%20a%20can%20have%20any%20sum%2C%20and%20thus%20has_sum%20nothing%0Adef%20forced_sum%20%28s%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28H%20%3A%20linear_independent%20%E2%84%9D%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%29%20%28S%20%3A%20%E2%84%9D%29%20%3A%20%28%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%E2%86%92%20%E2%84%9D%20%E2%86%92%20Prop%20%3A%3D%0A%CE%BB%20t%20T%2C%0Aif%20Ht%20%3A%20t%20%E2%88%88%20span%20%E2%84%9D%20%28set.range%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%29%20then%0A%20%20if%20T%20%3D%20finsupp.sum%20%28linear_independent.repr%20H%20%E2%9F%A8t%2C%20Ht%E2%9F%A9%29%20%28%CE%BB%20n%20r%2C%20r%20*%20%28S%20-%20finset.sum%20%28%40finset.univ%20%28fin%20n%29%20_%29%20%28%CE%BB%20k%2C%20s%20k.val%29%29%29%20%0A%20%20%20%20then%20true%20%0A%20%20else%20false%0Aelse%20false%0A%0A--%20linear%20algebra%20lemma%0Alemma%20spanning_set_subset_span%20%7BR%20M%20%3A%20Type%7D%20%5Bring%20R%5D%20%5Badd_comm_group%20M%5D%20%5Bmodule%20R%20M%5D%20%7Bs%20%3A%20set%20M%7D%20%3A%0A%20%20%E2%88%80%20x%20%E2%88%88%20s%2C%20x%20%E2%88%88%20span%20R%20s%20%3A%3D%20span_le.mp%20%28le_refl%20_%29%0A%0A--%20forced_sum%20does%20what%20we%20want%0Alemma%20forced_sum_val%20%28s%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28H%20%3A%20linea
r_independent%20%E2%84%9D%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%29%20%28S%20%3A%20%E2%84%9D%29%20%3A%20%0A%20%20forced_sum%20s%20H%20S%20s%20S%20%3A%3D%0Abegin%0A%20%20have%20Hs%E2%82%81%20%3A%20s%20%E2%88%88%20set.range%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%20%3A%3D%20set.mem_range.mpr%20%E2%9F%A80%2C%20by%20simp%20only%20%5Badd_zero%5D%E2%9F%A9%2C%0A%20%20have%20Hs%E2%82%82%20%3A%20s%20%E2%88%88%20span%20%E2%84%9D%20%28set.range%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%29%20%3A%3D%20spanning_set_subset_span%20_%20Hs%E2%82%81%2C%0A%20%20unfold%20forced_sum%2C%0A%20%20split_ifs%2C%0A%0Aend%0A%0A--%20i%20guess%20this%20is%20the%20hard%20part%0Alemma%20is_sum_forced_sum%20%28s%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28H%20%3A%20linear_independent%20%E2%84%9D%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%29%20%28S%20%3A%20%E2%84%9D%29%20%3A%0A%20%20is_sum%20%28forced_sum%20s%20H%20S%29%20%3A%3D%0A%E2%9F%A8begin%20end%2C%0A%20begin%20end%2C%0A%20begin%20end%2C%0A%20begin%20end%E2%9F%A9%0A%0Atheorem%20no_sum_of_lin_ind_shifts%20%28s%20%3A%20%E2%84%95%20%E2%86%92%20%E2%84%9D%29%20%28H%20%3A%20linear_independent%20%E2%84%9D%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%20s%20%28n%20%2B%20m%29%29%29%20%3A%20%0A%20%20%E2%88%80%20S%20%3A%20%E2%84%9D%2C%20%C2%AC%20has_sum%20s%20S%20%3A%3D%0A%CE%BB%20S%20HS%2C%20%0Ahave%20X%20%3A%20_%20%3A%3D%20HS%20%28forced_sum%20s%20H%20%28S%20%2B%201%29%29%20%28is_sum_forced_sum%20s%20H%20%28S%20%2B%201%29%29%20%28S%20%2B%201%29%20%28forced_sum_val%20s%20H%20%28S%20%2B%201%29%29%2C%0Aby%20linarith%0A%0A--%20CHALLENGE%3A%20formalise%20the%20proof%20here%3A%0A--%20https%3A%2F%2Fleanprover.zulipchat.com%2F%23narrow%2Fstream%2F116395-maths%2Ftopic%2FAxiomatised.20summations%2Fnear%2F178884724%0A--%20REQUIRES%20GENERATING%20FUNCTIONS%2C%20TAYLOR%20SERIES%20--%20not%20currently%20in%20Lean!%0Atheorem%20inv_shifts_lin_ind%20%3A%20linear_
independent%20%E2%84%9D%20%28%CE%BB%20m%20n%20%3A%20%E2%84%95%2C%201%20%2F%20%28n%20%2B%20m%20%2B%201%29%29%20%3A%3D%0Abegin%0A%0Aend" rel="nofollow noreferrer">play with it yourself</a>. And check out the challenge: proving that there exists a sequence that does <em>not</em> have a sum (see the informal proof <a href="https://leanprover.zulipchat.com/#narrow/stream/116395-maths/topic/Axiomatised.20summations/near/178884724" rel="nofollow noreferrer">here</a>). Actually providing an example (e.g. <span class="math-container">$1/n$</span>) may be quite hard (the proof in the chat uses generating functions, which would be hard in Lean), but proving that a sequence with linearly independent shifts has no sum should definitely be doable.</p>
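<p>(In informal terms, the proof of <code>has_sum_alt</code> above is just the classical manipulation, written against the axioms: suppose <span class="math-container">$S$</span> is a sum operator with <span class="math-container">$S\big((-1)^n\big)=t$</span>. Then:)</p>

```latex
% Mirrors has_sum_alt: H3 = sum_shift, H2 = sum_smul with c = -1, H0 = wd.
\begin{align*}
  \text{sum\_shift}:\quad & S\big((-1)^{n+1}\big) = t - 1 \\
  \text{sum\_smul}\ (c=-1):\quad & S\big((-1)\cdot(-1)^n\big) = -t \\
  \text{wd}:\quad & -t = t - 1 \implies t = \tfrac{1}{2}
\end{align*}
```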
<hr>
<p><strong>OLD ANSWER:</strong></p>
<p>Here's something to get you started -- I wrote it in Lean, a formal proof-checker, because these things are tricky and I wanted to be completely sure I was being rigorous. I suppose we also need <code>sum_con</code> for convergent sums, but I'm not sure where infinite series are in the Lean math library!</p>
<pre><code>[old code redacted]
</code></pre>Mon, 21 Oct 2019 13:47:08 GMThttps://math.stackexchange.com/questions/3402550/-/3402796#3402796Abhimanyu Pallavi Sudhir2019-10-21T13:47:08ZComment by Abhimanyu Pallavi Sudhir on Axiomatic Treatment Of Sum Operators Which Work On Divergent Series
https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series
The thing is, most such proofs of sums simply assume that the given sequence is summable. E.g. when proving that such an axiomatised sum of $(1,-1,1,-1...)$ is $1/2$, you don't apply some axioms and reduce the problem to a convergent sequence -- you simply show that if it has a sum, the sum must be $1/2$.Mon, 21 Oct 2019 11:32:49 GMThttps://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series?cid=6999088Abhimanyu Pallavi Sudhir2019-10-21T11:32:49ZComment by Abhimanyu Pallavi Sudhir on Axiomatic Treatment Of Sum Operators Which Work On Divergent Series
https://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series
Note that your axioms do not prove $\sum n = -1/12$, which requires removing an infinite number of zeroes (but this has other problems). I once tried formalising something like this -- the main trouble (if we're fine with discarding $-1/12$) is in choosing the set of "summable" sequences on which the operators are defined.Mon, 21 Oct 2019 11:29:55 GMThttps://math.stackexchange.com/questions/3402550/axiomatic-treatment-of-sum-operators-which-work-on-divergent-series?cid=6999085Abhimanyu Pallavi Sudhir2019-10-21T11:29:55ZAnswer by Abhimanyu Pallavi Sudhir for What is the motivation of uniform continuity?
https://math.stackexchange.com/questions/457008/what-is-the-motivation-of-uniform-continuity/3402176#3402176
0<p>One motivation comes from non-standard analysis, i.e. analysis with hyperreal numbers. This view is very useful (it makes things obvious) when looking at, e.g., the uniform limit theorem (the relationship between uniform continuity and uniform convergence).</p>
<p>Here, a real function <span class="math-container">$f$</span> is continuous at <span class="math-container">$x$</span> if <span class="math-container">$\hat{f}(x+\varepsilon)-\hat{f}(x)$</span> is infinitesimal for all infinitesimal <span class="math-container">$\varepsilon$</span>, where <span class="math-container">$\hat{f}$</span> is the natural extension of <span class="math-container">$f$</span> to the hyperreals. </p>
<p>A real function is <em>uniformly continuous</em> if it is <strong>continuous for all hyperreal <span class="math-container">$x$</span></strong> -- whereas a continuous function only needs to be continuous at real values of <span class="math-container">$x$</span>. </p>
<p>So it's obvious why <span class="math-container">$x^2$</span> is not uniformly continuous -- at <span class="math-container">$\omega$</span>, it turns increments by <span class="math-container">$1/\omega$</span> into increments of appreciable size (about <span class="math-container">$2$</span>). Or why <span class="math-container">$1/x$</span> isn't uniformly continuous on the positive reals -- at <span class="math-container">$\varepsilon$</span>, it turns increments by <span class="math-container">$\varepsilon$</span> into increments by <span class="math-container">$1/\varepsilon$</span>. It also explains why <span class="math-container">$\sqrt{x}$</span> <em>is</em> uniformly continuous on the positive reals -- although it turns <span class="math-container">$\varepsilon$</span> into <span class="math-container">$\sqrt{\varepsilon}$</span>, which has "higher order" -- <em>that's still an infinitesimal</em>.</p>
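This increment bookkeeping can be mimicked numerically -- a sketch, with a small finite increment standing in for an infinitesimal (floating point only imitates the hyperreals, so this is illustration, not proof):

```python
def increment(f, x, h):
    """How much f changes over an increment h at the point x."""
    return f(x + h) - f(x)

# x^2 at a "huge" point: an increment of size 1/x is inflated to appreciable size.
print(increment(lambda t: t * t, 1e6, 1e-6))     # about 2.0: not infinitesimal

# 1/x near zero: an increment comparable to x is blown up enormously.
print(increment(lambda t: 1.0 / t, 1e-6, 1e-6))  # about -5e5: not infinitesimal

# sqrt near zero: the increment becomes sqrt(h) -- bigger than h, but still small.
print(increment(lambda t: t ** 0.5, 0.0, 1e-8))  # 1e-4: still "infinitesimal"
```

The three printed values reproduce the three cases above: an appreciable increment for $x^2$, a huge one for $1/x$, and a still-small one for $\sqrt{x}$.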
<p>In real number speak, this just says that for any two sequences such that <span class="math-container">$x_n-y_n\to 0$</span>, we have <span class="math-container">$f(x_n)-f(y_n)\to 0$</span> (which is really the "sequential" form of uniform continuity). By contrast, for mere continuity this is only required when the <span class="math-container">$y_n$</span> are constant sequences.</p>Mon, 21 Oct 2019 00:06:46 GMThttps://math.stackexchange.com/questions/457008/-/3402176#3402176Abhimanyu Pallavi Sudhir2019-10-21T00:06:46ZComment by Abhimanyu Pallavi Sudhir on Countable dense subsets of $\mathbb R$ are homeomorphic
https://math.stackexchange.com/questions/1238572/countable-dense-subsets-of-mathbb-r-are-homeomorphic/1238601#1238601
The order part is easy -- it's the bijection that's hard.Sun, 20 Oct 2019 12:59:46 GMThttps://math.stackexchange.com/questions/1238572/countable-dense-subsets-of-mathbb-r-are-homeomorphic/1238601?cid=6996702#1238601Abhimanyu Pallavi Sudhir2019-10-20T12:59:46ZAnswer by Abhimanyu Pallavi Sudhir for If $S$ is an infinite $\sigma$ algebra on $X$ then $S$ is not countable
https://math.stackexchange.com/questions/320035/if-s-is-an-infinite-sigma-algebra-on-x-then-s-is-not-countable/3396962#3396962
0<p><strong><a href="https://thewindingnumber.blogspot.com/2019/10/sigma-fields-are-venn-diagrams.html" rel="nofollow noreferrer">Sigma algebras are just Venn diagrams.</a></strong> (with some caveats because of all the "<em>countable</em> union" business)</p>
<p>A sigma field <span class="math-container">$\mathcal{F}$</span> on <span class="math-container">$X$</span> defines an equivalence relation on <span class="math-container">$X$</span> where <span class="math-container">$x\sim y$</span> iff <span class="math-container">$\forall E\in \mathcal{F},x\in E\iff y\in E$</span>. This partition is just the partition defined by the Venn diagram -- by the little intersection regions. The important point is that there is a bijection <span class="math-container">$\mathcal{F}\leftrightarrow \mathcal{P}(X/\sim)$</span> -- this should also be obvious with the Venn diagrams.</p>
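A quick finite sanity check of this bijection -- a sketch using a toy sigma algebra on a six-element set (the even/odd example is my own choice of illustration, not from the question):

```python
from itertools import combinations

# A finite sigma algebra on X = {1,...,6}: only "even" vs "odd" is observable.
X = frozenset(range(1, 7))
F = {frozenset(), frozenset({1, 3, 5}), frozenset({2, 4, 6}), X}

# x ~ y iff every E in F contains both or neither; the classes are the
# "little intersection regions" of the Venn diagram.
Fs = list(F)
classes = {}
for x in X:
    classes.setdefault(tuple(x in E for E in Fs), set()).add(x)
atoms = [frozenset(c) for c in classes.values()]

# The bijection F <-> P(X/~): every union of atoms lies in F, and |F| = 2^#atoms.
unions = {frozenset().union(*combo) for r in range(len(atoms) + 1)
          for combo in combinations(atoms, r)}
assert unions == F and len(F) == 2 ** len(atoms)
print(len(atoms), len(F))  # 2 atoms, so |F| = 2^2 = 4
```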
<p>So what are the possible values for the cardinality of a power set?</p>Thu, 17 Oct 2019 01:10:41 GMThttps://math.stackexchange.com/questions/320035/-/3396962#3396962Abhimanyu Pallavi Sudhir2019-10-17T01:10:41ZSigma fields are Venn diagrams
https://thewindingnumber.blogspot.com/2019/10/sigma-fields-are-venn-diagrams.html
0The starting point for probability theory will be to note the difference between <b>outcomes</b> and <b>events</b>.<br /><br />An <b>outcome</b> of an experiment is a fundamentally non-empirical notion, about our theoretical understanding of what states a system may be in -- it is, in a sense, analogous to the "microstates" of statistical physics. The set of all <i>outcomes</i> $x$ is called the <b>sample space</b> $X$, and is the fundamental space to which we will give a probabilistic structure (we will see what this means).<br /><br />Our actual observations, the events, need not be so precise -- for example, our measurement device may not actually measure the exact sequence of heads and tails as the result of an experiment, but only the total number of heads, or something -- analogous to a "macrostate". But these measurements <i>are</i> statements about what microstates we know are possible for our system to be in -- i.e. they correspond to sets of outcomes. These sets of outcomes that we can "talk about" are called <b>events</b> $E$, and the set of all possible events is called a <b>field</b> $\mathcal{F}\subseteq 2^X$.<br /><br />For instance: if our sample space is $\{1,2,3,4,5,6\}$ and our measurement apparatus is a guy who looks at the reading and tells us if it's even or odd, then the field is $\{\varnothing, \{1,3,5\},\{2,4,6\},X\}$. We simply <i>cannot</i> talk about sets like $\{1,3\}$ or $\{1\}$. Our information just doesn't tell us anything about sets like that -- when we're told "odd", we're never hinted if the outcome was 1 or 3 or 5, so we can't even have prior probabilities -- we can't even give probabilities to whether a measurement was a 1 or a 3.<br /><br />Well, what kind of properties characterise a field? 
There's actually a bit of ambiguity in this -- it's clear that a field should be closed under <b>negation and <i>finite</i> unions</b> (and finite intersections follow via de Morgan) -- if you can talk about whether $P_1$ and $P_2$ are true, you can check each of them to decide if $P_1\lor P_2$ is true (and since a proposition $P$ corresponds to a set $S$ in the sense that $P$ says "one of the outcomes in $S$ is true", $\lor$ translates to $\cup$). But if you have an infinite number of $P_i$'s, can you really check each one of them so that you can say without a doubt that a field is closed under arbitrary union?<br /><br />Well, this is (at this point) really a matter of convention, but we tend to choose the convention where the field is closed under <b>negation and <i>countable</i> unions</b>. Such a field is called a <b>sigma-field</b>. We will actually see where this convention comes from (and why it is actually important) when we define probability -- in fact, it is required for the idea that one may have a uniform probability distribution on a compact set in $\mathbb{R}^n$.<br /><br /><hr /><br />A beautiful way to understand fields and sigma fields is in terms of venn diagrams -- in fact, as you will see, <b>fields are precisely a formalisation of Venn diagrams</b>. I was pretty amazed when I discovered this (rather simple) connection for myself, and you should be too.<br /><br />Suppose your experiment is to toss three coins, and make "partial measurements" on the results through three "measurement devices":<br /><ul><li><b>A:</b> Lights up iff the number of heads was at least 2.</li><li><b>B:</b> Lights up iff the first two coins landed heads.</li><li><b>C:</b> Lights up iff the third coin landed heads.</li></ul>What this means is that $A$ gives you the set $\{HHT, HTH, THH, HHH\}$, $B$ gives you the set $\{HHH, HHT\}$, $C$ gives you the set $\{HHH, HTH, THH, TTH\}$. 
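For the concrete-minded, the field these three devices generate can be computed by brute-force closure under complement and union -- a sketch (the helper `generate_field` is my own, not a standard library function):

```python
# The three events from the example above, outcomes written as strings like "HHT".
outcomes = frozenset(a + b + c for a in "HT" for b in "HT" for c in "HT")
A = frozenset(o for o in outcomes if o.count("H") >= 2)   # at least 2 heads
B = frozenset(o for o in outcomes if o.startswith("HH"))  # first two coins heads
C = frozenset(o for o in outcomes if o[2] == "H")         # third coin heads

def generate_field(X, generators):
    """Close the generators under complement and pairwise union (brute force)."""
    X = frozenset(X)
    field = {frozenset(), X} | {frozenset(g) for g in generators}
    while True:
        new = {X - E for E in field} | {E1 | E2 for E1 in field for E2 in field}
        if new <= field:
            return field
        field |= new

F = generate_field(outcomes, [A, B, C])
print(len(F))  # 32 = 2^5: the devices carve the 8 outcomes into 5 Venn regions
```

The count comes out as $2^5 = 32$: the three devices distinguish exactly five regions of the eight-outcome sample space.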
Based on precisely which devices light up, you can decide the truth values of $\lnot$'s and $\lor$'s of these statements, i.e. complements and unions of these sets -- this is the point of fields, of course.<br /><br />Or we could visualise things.<br /><br /><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-_GSgU8wZr3o/XaXXGandEiI/AAAAAAAAFxI/DgbNQDmdnMgiiFFaxcM9_x8h37s0F5nVQCEwYBhgL/s1600/venn%2Bsigma.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="588" data-original-width="948" height="247" src="https://1.bp.blogspot.com/-_GSgU8wZr3o/XaXXGandEiI/AAAAAAAAFxI/DgbNQDmdnMgiiFFaxcM9_x8h37s0F5nVQCEwYBhgL/s400/venn%2Bsigma.png" width="400" /></a></div>The Venn diagram produces a partition of $X$ corresponding to the equivalence relation of "indistinguishability", i.e. "every event containing one outcome contains the other". The <i>field</i> consists precisely of the sets one can "mark" on the Venn diagram -- i.e. unions of the elements of the partition.<br /><br />A consequence of this becomes immediately obvious:<br /><br /><b>Given a field $\mathcal{F}$ corresponding to the partition $\sim$, the following bijection holds: $\mathcal{F}\leftrightarrow 2^{X/\sim}$.</b><br /><br />Consequences of this include: the cardinalities of finite sigma fields are precisely the powers of two; there is no countably infinite sigma field.<br /><br /><hr /><br />Often, one may want to process some raw data from an experiment to obtain some processed data. For example, let $X=\{HH,HT,TH,TT\}$ and the initial measurement be of the number of heads:<br /><br />$$\begin{align}<br />\mathcal{F}=&\{\varnothing, \{TT\}, \{HT, TH\}, \{HH\},\\<br />& \{TT, HT, TH\}, \{TT, HH\}, \{HT, TH, HH\}, X \}<br />\end{align}$$<br />What kind of properties of the outcome can we talk about with certainty given the number of heads? 
For example, we can talk about the question "was there at least one heads?"<br /><br />$$\mathcal{G}=\{\varnothing, \{TT\}, \{HT, TH, HH\}, X\}$$<br />There are two ways to understand this "processing" or "re-measuring". One is as a function $f:\frac{X}{\sim_\mathcal{F}}\to \frac{X}{\sim_\mathcal{G}}$. Recall that:<br /><br />$$\begin{align}<br />\frac{X}{\sim_\mathcal{F}}&=\{\{TT\},\{HT,TH\},\{HH\}\}\\<br />\frac{X}{\sim_\mathcal{G}}&=\{\{TT\},\{HT,TH,HH\}\}<br />\end{align}$$<br />Any such $f$ is a permissible "<b>measurable function</b>", as long as $\sim_\mathcal{G}$ is at least as coarse a partition as $\sim_\mathcal{F}$. In other words, a function from $X/\sim_1$ to $(X/\sim_1)/\sim_2$ is always measurable.<br /><br />But there's another, more "natural", less weird and mathematical way to think about a re-measurement -- as a function $f:X\to Y$, where in this case $Y=\{0,1\}$: an outcome maps to 1 if it has at least one heads, and to 0 if it does not.<br /><br />But there's a catch: knowing that an event $E_Y$ in $Y$ occurred is equivalent to knowing that <i>an</i> outcome in $X$ mapping into $E_Y$ occurred -- i.e. that the event $\{x\in X\mid f(x)\in E_Y\}$ occurred. Such an event must be in the field on $X$, i.e.<br /><br />$$\forall E\in\mathcal{F}_Y,\ f^{-1}(E)\in\mathcal{F}_X$$<br />This is the condition for a <b>measurable function</b>, also known as a <b>random variable</b>.<br /><br /><hr /><br />One may observe certain analogies between the measurable spaces outlined above, and topology -- in the case of countable sample spaces, there actually is a correspondence. The similarity between a Venn diagram and casual drawings of a topological space is not completely superficial.<br /><br />The key idea behind fields is mathematically a notion of "distinguishability" -- if all we can measure is the number of heads, $HHTTH$ and $TTHHH$ are identical to us. For all practical purposes, we can view the sample space as the partition by this equivalence relation. 
They are basically the "same point".<br /><br />It's this notion that a <b>measurable function</b> seeks to encapsulate -- it is, in a sense, a <b>generalisation of a function</b> from set theory. A function <b>cannot distinguish indistinguishable points</b> -- in set theory, "indistinguishability" is just equality, the discrete partition; a measurable function <b>cannot distinguish indistinguishable points</b> -- but in measurable spaces, "indistinguishability" is given by some equivalence relation.<br /><br />Let's see this more precisely.<br /><br />Given sets with equivalence relations $(X,\sim)$, $(Y,\sim)$, we want to ensure that some function $f:X\to Y$ "lifts" to a function $f:\frac{X}{\sim}\to\frac{Y}{\sim}$ such that $f([x])=[f(x)]$. <br /><br /><b>(Exercise:</b> Show that this (i.e. this "definition" being well-defined) is equivalent to the condition $\forall E\in\mathcal{F}_Y, f^{-1}(E)\in \mathcal{F}_X$. It may help to draw out some examples.)<br /><br />Well, this expression of the condition -- as $f([x])=[f(x)]$ -- even if technically misleading (the two $f$'s aren't really the same thing) gives us the interpretation that a measurable function is one that <i>commutes with the partition</i> or <i>preserves the partition</i>.<br /><br />While homomorphisms in settings other than measurable spaces do not precisely follow the "cannot distinguish related points" notion, they do follow a generalisation where equivalence relations are replaced with other relations, operations, etc. -- in topology, a continuous function preserves limits; in group theory, a group homomorphism preserves the group operation; in linear algebra, a linear transformation preserves linear combinations; in order theory, an increasing function preserves order, etc. 
In any case, a homomorphism is a function that does not "break" relationships by creating a "finer" relationship on the target space.measurable functionprobabilityprobability theoryrandom variablessigma fieldvenn diagramThu, 17 Oct 2019 00:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3521631282555558856Abhimanyu Pallavi Sudhir2019-10-17T00:57:00ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
But in any case, this is going off on a tangent -- my point is that the claim in your answer that "the Schrodinger equation is a wave equation" is not a useful one, especially for this question, which explicitly asks about the formal relation between the diffusion equation and the Schrodinger equation. The observation that the Schrodinger equation admits sinusoidal solutions is not a particularly enlightening one, nor is it very revealing to point out that the classical diffusion equation doesn't.Sun, 13 Oct 2019 08:35:01 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1144660#145217Abhimanyu Pallavi Sudhir2019-10-13T08:35:01ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
Sorry, but your definition makes no sense -- e.g. linear combinations of such solutions are also waves. But I don't deny that you <i>can</i> make a definition in your sense -- I just deny that it's very conceptually useful. It may be conceptually useful to classify the "higher-order derivatives in $x$" cases as waves if they are to be understood as "corrections" of an ordinary wave of sorts, I don't know. You can replace my definition with $\partial_\mu\partial_\nu\boldsymbol{\Psi}=0$ if you like.Sun, 13 Oct 2019 08:31:30 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1144657#145217Abhimanyu Pallavi Sudhir2019-10-13T08:31:30ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
What are you talking about? What's your definition of a wave? You can invent an obfuscated definition of a "wave" under which the Schrodinger equation is a "wave equation", but it would still be <i>conceptually different</i> from the wave equation $\partial^2\psi/\partial x^2=\partial^2\psi/\partial t^2$. Physically <i>fundamentally different</i> equations ought to be called different names, even if some specific solutions appear similar to you -- this isn't "arbitrary".Sat, 12 Oct 2019 16:00:14 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1144458#145217Abhimanyu Pallavi Sudhir2019-10-12T16:00:14ZNon-surjectivity of exponential map: how to understand?
https://math.stackexchange.com/questions/3383612/non-surjectivity-of-exponential-map-how-to-understand
0<p>I'm given to understand the exponential map is not generally surjective -- the standard example is <span class="math-container">$\mathrm{SL}_2(\mathbb{R})$</span> <a href="https://math.stackexchange.com/questions/643216/non-surjectivity-of-the-exponential-map-to-sl2-mathbbc">[ 1 ]</a>. </p>
<p>I can clearly see why this is so in the non-connected case -- the tangent space is a tangent space to the connected component alone, so its image must be contained in the connected component. <strong>I do not see why the map isn't surjective in the connected case.</strong></p>
<p>I also don't see why the map is then <em>again</em> surjective in the compact case -- <a href="https://en.wikipedia.org/wiki/Maximal_torus#Properties" rel="nofollow noreferrer">Wikipedia</a> claims that this is a special case of "the exponential map is surjective if every element is contained in a maximal torus". Is this right? Is there a good way to understand why this is true?</p>
<hr>
<p>Note that I am not looking for counter-examples: I'm aware of them. I'm looking for intuition -- perhaps a clever look at what the image of the exponential map actually looks like in the non-surjective case (how it "misses" some of the points in the group). </p>
<p>As an analogy, if asked to explain smooth non-analytic functions, it would be more instructive (than simply providing the example of <span class="math-container">$e^{-1/x}$</span>) to explain that a function may grow slower than all polynomials near zero -- and provide the construction as <span class="math-container">$1/f(1/x)$</span> from any function <span class="math-container">$f$</span> that grows faster than all polynomials as <span class="math-container">$x\to\infty$</span>.</p>
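(For what it's worth, the $\mathrm{SL}_2(\mathbb{R})$ picture can at least be probed numerically: a real traceless $X$ has eigenvalues $\pm\mu$ with $\mu$ real or purely imaginary, so $\mathrm{tr}\,e^X$ is either $2\cosh\mu\ge 2$ or $2\cos\theta\ge -2$ -- the image misses everything with trace below $-2$. A sketch, with a hand-rolled power-series exponential:)

```python
import random

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm2(m, terms=40):
    """exp of a 2x2 real matrix via the plain power series (fine at this scale)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = mat_mul(term, [[x / n for x in row] for row in m])
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

random.seed(0)
for _ in range(2000):
    a, b, c = (random.uniform(-3.0, 3.0) for _ in range(3))
    g = expm2([[a, b], [c, -a]])            # exp of traceless X lies in SL(2,R)
    assert g[0][0] + g[1][1] > -2.0 - 1e-6  # tr(exp X) never dips below -2
```

Yet $\mathrm{diag}(-2,-1/2)$ has determinant $1$ and trace $-5/2$: it lies in the group but, by the bound above, not in the image of $\exp$.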
<p>(See <a href="https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras">here</a> for more examples of the kind of intuition I'm looking for, within the context of Lie theory.)</p>lie-groupslie-algebrasMon, 07 Oct 2019 00:43:08 GMThttps://math.stackexchange.com/q/3383612Abhimanyu Pallavi Sudhir2019-10-07T00:43:08ZThe Killing form; factorising non-Abelian Lie groups
https://thewindingnumber.blogspot.com/2019/10/the-killing-form-factorising-non.html
0It could be fun to try and define a "dot product" on a Lie algebra.<br /><br />You know, you might've already realised that the cross product is a Lie bracket of sorts -- you know, given its antisymmetry and the whole $a^\mu b^\nu - a^\nu b^\mu$ representation of the wedge product and all that. It's a short exercise to verify that the Lie algebra $\mathfrak{so}(3)$ of $SO(3)$ is the algebra of skew-symmetric matrices, and with the Lie bracket $XY-YX$ is isomorphic to $\mathbb{R}^3$ with the cross product.<br /><br />Well, the dot product on $\mathbb{R}^3$ has an interesting connection to $SO(3)$ -- it is precisely the form that is invariant under the action of $SO(3)$. Well, but that's $SO(3)$ acting on $\mathbb{R}^3$ -- what is that action in the notation of $\mathfrak{so}(3)$? As it turns out (and you can work this out), it is <b>precisely the adjoint map</b> $\mathrm{Ad}_gX:=gXg^{-1}$ which corresponds to this "rotating $X$ by $g$". It's not really that unexpected, if you ask me -- conjugation is always the natural way to transform matrices in linear algebra when vectors are multiplied on the left.<br /><br />So the "dot product" is an $\mathrm{Ad}$-invariant bilinear form. In fact, adding a symmetricity requirement allows us to just bother with norms (as a symmetric inner product can be determined from the norm, through the cosine rule). Conjugation basically allows you to determine the "<b>contours</b>" of this norm or inner product. 
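A numerical sanity check of the claims so far, assuming only the usual hat-map identification $\mathfrak{so}(3)\cong(\mathbb{R}^3,\times)$: one concrete $\mathrm{Ad}$-invariant symmetric bilinear form, $\mathrm{tr}(\mathrm{ad}_u\,\mathrm{ad}_v)$, comes out as a multiple of the dot product, as the uniqueness argument predicts.

```python
def hat(v):
    """Matrix of ad_v under so(3) ~ R^3: hat(v) applied to w is v x w."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def mul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace3(m):
    return m[0][0] + m[1][1] + m[2][2]

u, v = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
form = trace3(mul3(hat(u), hat(v)))
dot = sum(a * b for a, b in zip(u, v))
print(form, -2.0 * dot)  # both -22: tr(ad_u ad_v) = -2 (u . v)
```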
The question is: can we determine the bilinear form -- up to scaling -- just from "<b>$\mathrm{Ad}$-invariant symmetric bilinear form</b>" alone?<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-5hjfT6Pw29w/XYyu1CJnKtI/AAAAAAAAFvQ/Fq0LC1Bgqv0QBcW8I3_5DabJkyLOX254QCLcBGAsYHQ/s1600/conjugation%2Bcontour.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="461" data-original-width="601" height="244" src="https://1.bp.blogspot.com/-5hjfT6Pw29w/XYyu1CJnKtI/AAAAAAAAFvQ/Fq0LC1Bgqv0QBcW8I3_5DabJkyLOX254QCLcBGAsYHQ/s320/conjugation%2Bcontour.png" width="320" /></a></div><br />This is equivalent to asking "<i>does the orbit of some non-zero $X$ under conjugation by $G$ span $\mathfrak{g}$?</i>" (so that the norm of that $X$ would suffice to determine all norms -- do you see why?) Well, this is equivalent to asking "<i>is $X$ contained in some non-trivial ideal?</i>" (prove that these are equivalent!), and this is equivalent to asking "<i>does $\mathfrak{g}$ have any non-trivial ideals?</i>" (do you see why?)<br /><br />A Lie algebra without nontrivial ideals is called a <b>simple Lie algebra</b>. Our demonstration above shows that a simple Lie algebra has a unique (up to scaling) $\mathrm{Ad}$-invariant symmetric bilinear form, determined by the value of $\langle X, X\rangle$ for some non-zero $X$.<br /><br /><div class="twn-furtherinsight">Even before we actually derive what this form must look like, we can derive one important consequence of automorphism invariance: $\langle X, [X, Y]\rangle = 0$ (prove it!), i.e. the tangent to an automorphism curve is perpendicular to the position vector at every point. 
The understanding of the group as acting as a "rotation group" on its Lie algebra in the adjoint representation really makes sense!</div><br /><div class="twn-beg">Someone tell me if they know how one may "derive" the trace-form formula from this characterisation rather than pulling it out of the blue and <em>then</em> proving it is the unique $\mathrm{Ad}$-invariant symmetric bilinear form. Here's something I started to write:<br /><br />Here's an idea for the base length (i.e. to define the scaling): $X$ has length 1 iff the length of $[X,Y]$ equals the length of $Y$ for all $Y$ perpendicular to $X$ -- equivalently: $\forall V\in\mathfrak{g}, |[X,[X,V]]|=|[X,V]|$. We need to check that this condition is well-defined, i.e. that:<br /><ol><li>Given an $X$, $|[X,[X,U]]|=|[X,U]|$ for some $U$ not a multiple of $X$ implies that $|[X,[X,V]]|=|[X,V]|$ for all $V$.</li><li>$X$ satisfying $|[X,[X,V]]|=|[X,V]|$ implies that all conjugates $gXg^{-1}$ of it satisfy it too. This is trivial from considering $V=gV'g^{-1}$ (since the identity is true for all $V$).</li></ol>Is the first one even true outside $\mathfrak{so}(3)$ -- for all simple Lie algebras?</div><br />One may come up with the idea of defining a form $\langle X, Y\rangle = \mathrm{tr}[X,[Y,\cdot]]$ (example of some weak motivation -- the vector triple product $x\times(x\times v)$ has as eigenvectors the vectors $v$ perpendicular to $x$ and the eigenvalues depend on the length of $x$) and check that this is indeed an $\mathrm{Ad}$-invariant symmetric bilinear form, and is thus unique up to scaling for simple Lie algebras. This form is called the <b>Killing form</b>.<br /><br /><hr /><br /><b>Factorisation of Lie groups</b><br /><br />We have seen the classification of connected Abelian Lie groups: they are products of circles and lines. 
We wonder if such a classification is possible for more general Lie groups.<br /><br />The natural way to "factorise" groups is by taking quotients by normal subgroups -- we wonder if this means that all Lie groups can be written as direct products of <b>simple Lie groups</b> (groups that don't have a nontrivial connected normal subgroup -- can you see why "connected" matters?). Well, not really -- the quotients need not be subgroups at all, after all. Instead, the "factorisation" takes the form of what is known as a <b>group extension</b>. A group for which it <i>is</i> a direct product is called a <b>reductive Lie group</b> -- and its Lie algebra is the direct sum of simple Lie algebras, or a <b>reductive Lie algebra</b>.<br /><br /><div class="twn-pitfall">It is more conventional in the literature to define a simple Lie algebra excluding the one-dimensional/abelian case. In this definition, direct sums of simple Lie algebras are <b>semisimple Lie algebras</b>, and reductive Lie algebras are direct sums of semisimple and abelian Lie algebras.</div><br />TBC: Cartan's criterion, solvability, nilpotency<br />killing formlie grouplie theorynormal subgroupTue, 01 Oct 2019 23:05:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7359488371150159165Abhimanyu Pallavi Sudhir2019-10-01T23:05:00ZComment by Abhimanyu Pallavi Sudhir on Intuition for the Killing form as "automorphism-invariant symmetric bilinear form"
https://math.stackexchange.com/questions/3369402/intuition-for-the-killing-form-as-automorphism-invariant-symmetric-bilinear-for
@Travis Right, thanks -- I got confused for some reason.Wed, 25 Sep 2019 16:44:13 GMThttps://math.stackexchange.com/questions/3369402/intuition-for-the-killing-form-as-automorphism-invariant-symmetric-bilinear-for?cid=6934150Abhimanyu Pallavi Sudhir2019-09-25T16:44:13ZComment by Abhimanyu Pallavi Sudhir on How to derive the general formula for the Killing form for classical Lie algebras?
https://math.stackexchange.com/questions/2578195/how-to-derive-the-general-formula-for-the-killing-form-for-classical-lie-algebra/2578250#2578250
Wait, then why is it not a multiple of $\mathrm{tr}(XY)$ for $\mathfrak{gl}_{\mathbb{R}}(n)$?Wed, 25 Sep 2019 16:32:58 GMThttps://math.stackexchange.com/questions/2578195/how-to-derive-the-general-formula-for-the-killing-form-for-classical-lie-algebra/2578250?cid=6934131#2578250Abhimanyu Pallavi Sudhir2019-09-25T16:32:58ZComment by Abhimanyu Pallavi Sudhir on Intuition for the Killing form as "automorphism-invariant symmetric bilinear form"
https://math.stackexchange.com/questions/3369402/intuition-for-the-killing-form-as-automorphism-invariant-symmetric-bilinear-for
@DietrichBurde Sorry, I changed it to "bilinear form". That's not really the kind of intuition I'm trying to get at, though -- I don't need help motivating a bilinear form.Wed, 25 Sep 2019 14:23:37 GMThttps://math.stackexchange.com/questions/3369402/intuition-for-the-killing-form-as-automorphism-invariant-symmetric-bilinear-for?cid=6933785Abhimanyu Pallavi Sudhir2019-09-25T14:23:37ZIntuition for the Killing form as "automorphism-invariant symmetric bilinear form"
https://math.stackexchange.com/questions/3369402/intuition-for-the-killing-form-as-automorphism-invariant-symmetric-bilinear-for
0<p>Here's my idea for motivating the Killing form: the <em>only notion</em> we have of magnitudes and angles in a Lie algebra comes from conjugations, as they can be understood to be the "natural" transformations on the Lie algebra. So it's natural to ask for a norm map that satisfies <span class="math-container">$\forall g\in G$</span>,</p>
<p><span class="math-container">$$\|X\|=\|\mathrm{Ad}_gX\|$$</span>
And hopefully we can then use symmetry to pin down a bilinear form. The idea is that we can already compare two vectors in the same line, and this condition creates <em>contours</em> that are precisely the <em>orbits of conjugation</em>, which allows us to compare vectors lying in the same ideal. </p>
<p>So in a simple Lie algebra, the bilinear form would then be completely determined up to scaling.</p>
<p>Am I on a sensible track? I guess what I'm asking is:</p>
<ol>
<li>Am I right to believe that "bilinear, symmetric and automorphism-invariant" uniquely determine the Killing form (up to scaling) for simple Lie algebras? </li>
<li>If so, how can I prove the <span class="math-container">$\mathrm{tr}(\mathrm{ad}(x)\mathrm{ad}(y))$</span> formula from this characterisation?</li>
<li>How might I extend this intuition to non-simple Lie algebras? I think I can "see" why the "semisimple iff non-degenerate" property is true, though.</li>
</ol>
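(As a concrete anchor for question 2 -- not a derivation -- the trace formula is at least easy to test against in a small case. A sketch for $\mathfrak{sl}_2(\mathbb{R})$, where $\mathrm{tr}(\mathrm{ad}\,x\,\mathrm{ad}\,y)$ works out to $4\,\mathrm{tr}(xy)$ on the standard basis; the helper functions are ad hoc:)

```python
# Standard basis of sl2(R) as 2x2 matrices.
e, h, f = [[0, 1], [0, 0]], [[1, 0], [0, -1]], [[0, 0], [1, 0]]
basis = [e, h, f]

def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(a, b):
    ab, ba = mul2(a, b), mul2(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def coords(m):
    # A traceless [[b, a], [c, -b]] has coordinates (a, b, c) in the (e, h, f) basis.
    return [m[0][1], m[0][0], m[1][0]]

def ad(x):
    # Columns of the matrix of ad_x are the coordinates of [x, e], [x, h], [x, f].
    cols = [coords(bracket(x, b)) for b in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(x, y):
    A, B = ad(x), ad(y)
    return sum(A[i][k] * B[k][i] for i in range(3) for k in range(3))

for x in basis:
    for y in basis:
        xy = mul2(x, y)
        assert killing(x, y) == 4 * (xy[0][0] + xy[1][1])  # K(x,y) = 4 tr(xy)
```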
<p>(See <a href="https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras">here</a> for examples of the kind of intuition, motivation I'm looking for. Based on advice there, I'm splitting my "intuition for Lie algebras" questions.)</p>lie-groupslie-algebrasautomorphism-groupWed, 25 Sep 2019 13:10:24 GMThttps://math.stackexchange.com/q/3369402Abhimanyu Pallavi Sudhir2019-09-25T13:10:24ZComment by Abhimanyu Pallavi Sudhir on Developing intuition for Lie groups and Lie algebras
https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras
@Travis That makes some sense... Is not being simply connected <i>equivalent</i> to being a discrete quotient of its universal cover?Wed, 25 Sep 2019 04:34:26 GMThttps://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras?cid=6932794Abhimanyu Pallavi Sudhir2019-09-25T04:34:26ZComment by Abhimanyu Pallavi Sudhir on Developing intuition for Lie groups and Lie algebras
https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras
@DietrichBurde I know that what you're saying is true -- I just don't have an intuitive understanding of <i>why</i>. Why is simple-connectedness a global property but compactness not? For example, $S^1$ and $\mathbb{R}$ having the same Lie algebra -- why is that due to the fact that $S^1$ isn't simply-connected, rather than because $\mathbb{R}$ isn't compact?Wed, 25 Sep 2019 04:31:00 GMThttps://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras?cid=6932789Abhimanyu Pallavi Sudhir2019-09-25T04:31:00ZComment by Abhimanyu Pallavi Sudhir on Developing intuition for Lie groups and Lie algebras
https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras
@Joppy $SL_2(\mathbb{R})$ is connected, see <a href="https://math.stackexchange.com/questions/1010768/prove-that-operatornamesln-r-is-connected">here</a>. As for your explanation of the Killing form -- I've seen this and similar explanations (e.g. "the only associative bilinear operator") before, but I don't really understand why these defining properties are important.Wed, 25 Sep 2019 04:10:58 GMThttps://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras?cid=6932774Abhimanyu Pallavi Sudhir2019-09-25T04:10:58ZDeveloping intuition for Lie groups and Lie algebras
https://math.stackexchange.com/questions/3368390/developing-intuition-for-lie-groups-and-lie-algebras
5<p><strong>Background:</strong> Until now, I've been able to <em>motivate</em> everything I've learned in mathematics, and develop some solid insights for everything. But I learned some Lie theory this summer, and while I have a good grasp of the elementary aspects and strong intuition for <em>some</em> or even <em>most</em> of what I've learned, there are some "holes" in my understanding of Lie algebras.</p>
<p>To give you an idea of what I'm looking for, I'll list some examples of things in Lie theory I <strong>DO understand</strong> and am able to motivate:</p>
<ul>
<li>The notion of a <strong>Lie group</strong> itself -- the idea comes from wanting to generalise what we know about discrete groups to more complicated contexts where the "manifold" structure of the group allows us to do so. Examples: <strong>compactness</strong> generalises finiteness, <strong>one-parameter groups</strong> generalise cyclic groups, etc.</li>
<li>The <strong>exponential map</strong> -- For one-parameter groups to generalise cyclic groups, we need a "generalisation" of the group power to allow "real-index powers". The general way to define a <strong>real power</strong> is through the exponential map. Well, this real power stuff isn't <em>always</em> defined as it turns out (you need the exponential map to be surjective), but our motivation does explain why it "makes sense" that the <strong>exponential map is surjective in the connected abelian case</strong> (because then, the Lie algebra is basically a co-ordinate system on the Lie group -- I'm aware exponential co-ordinates are defined in more generality, but it's certainly more well-behaved here).</li>
<li>The <strong>Lie algebra</strong>, i.e. "why is the logarithm/parameter space the tangent space?" We'd like to generalise the notion of a generator to a Lie group -- consider e.g. the circle group on the complex plane. An element near the identity generates a cyclic group, and as the element goes nearer to the identity -- as it becomes an <strong>infinitesimal generator</strong>, the cyclic group it generates approaches the entire group. Well, an element close to the identity is of the form <span class="math-container">$1+\varepsilon t X$</span>, and generates a group element as <span class="math-container">$(1+\varepsilon tX)^{1/\varepsilon}=e^{tX}$</span>. This is also intuition for the compound-interest limit, and for Euler's identity.</li>
<li>The <strong>Lie bracket</strong> is the second-derivative of the commutator curve <span class="math-container">$\gamma(t)=e^{tX}e^{tY}e^{-tX}e^{-tY}$</span>. Well, it's also the derivative of <span class="math-container">$\gamma(\sqrt{t})$</span>, which proves <strong>closure under the Lie bracket</strong>. </li>
<li>The real justification for the Lie bracket, however, comes from the fundamental fact that <span class="math-container">$\mathrm{ad}:\mathfrak{g}\to\mathrm{Der}(\mathfrak{g}),\ X\mapsto[X,\cdot]$</span> is the differential of the adjoint map <span class="math-container">$\mathrm{Ad}:G\to\mathrm{Aut}(G),\ g\mapsto(x\mapsto gxg^{-1})$</span>, which is a group homomorphism. In particular, the preservation of the Lie Bracket by the differential of a group homomorphism is precisely the <strong>Jacobi identity</strong>: <span class="math-container">$\mathrm{ad}([x,y])=[\mathrm{ad}(x),\mathrm{ad}(y)]$</span>. The basic point is that we are trying to reduce Lie group problems to Lie algebra ones as much as possible, and conjugation is an important idea whose induced map on the Lie algebra we'd like to see -- we are seeing the result of the obvious fact that <span class="math-container">$T\mathrm{Aut}(G)\subseteq\mathrm{Der}(TG)$</span> (and also <span class="math-container">$T\mathrm{Aut}(M)=\mathrm{Der}(M)$</span> -- the fact that the automorphisms of an object form a group is equivalent to the derivations on an object forming a Lie algebra). Some more examples of the "study the Lie algebra approach":
<ul>
<li>The uniqueness of the determinant as a map from <span class="math-container">$G\to \mathbb{R}-\{0\}$</span>.</li>
<li>An <strong>ideal</strong> is a subalgebra "induced" on the Lie algebra by a normal subgroup of the Lie group. This immediately provides the interpretation as "kernels of Lie algebra homomorphisms" as well as the condition <span class="math-container">$[\mathfrak{g},\mathfrak{i}]\subseteq\mathfrak{i}$</span>. </li>
</ul></li>
<li>The idea behind the manifold-structure of a Lie group is that the flows are produced by left-multiplication by group elements, so those must be homeomorphisms. This motivation can be confirmed through various topological consequences, e.g.
<ul>
<li><strong>A neighbourhood of the identity generates the connected component.</strong> The idea behind the proof is this: if an entire open neighbourhood of the identity is contained in the subgroup, it means you can "flow in any direction" from the subgroup -- but to bring these flows to an arbitrary point of the manifold, you need left-multiplication to be a homeomorphism. </li>
<li><strong>The identity component is a (normal) subgroup.</strong> Because left-multiplication and inversion are continuous, they cannot tear the connected component apart (generalised "intermediate value theorem"), so it is closed under multiplication.</li>
<li><strong>Compact Lie groups</strong> -- How can a Lie group possibly "close in on itself"? Surely we keep "extending" an open neighbourhood <span class="math-container">$W$</span> of the identity by observing that <span class="math-container">$xW$</span> must be in the subgroup? The idea is that these translations of <span class="math-container">$W$</span> form an <strong>open cover of the group</strong>; if it has a <strong>finite subcover</strong>, then it makes sense for the group to close in on itself. By playing around with different open neighbourhoods <span class="math-container">$W$</span> and taking some suitable unions, one can see that this is equivalent to the condition that every open cover has a finite subcover, i.e. the group is compact.</li>
</ul></li>
<li><strong>Characterisation of Abelian Lie groups</strong> -- "Compact Connected Abelian Lie Group is a torus" is a generalisation of "finite Abelian group is a product of cyclic groups" -- the idea is that the exponential map "wraps" the Lie algebra around into the Lie group -- this just gives the quotient of the Lie algebra by the kernel of the exponential map, which is topologically <span class="math-container">$\mathbb{R}^n/\mathbb{Z}^n$</span>. The characterisation of a connected Abelian Lie group as a cylinder <span class="math-container">$\mathbb{R}^{n+k}/\mathbb{Z}^k$</span> follows similarly.</li>
</ul>
<p>With that said, here are some things I <strong>DON'T (completely) understand</strong>, and would like to have a similar level of understanding for:</p>
<ul>
<li>Why is the <strong>structure of a Lie group characterised by its second-order structure</strong>? I know that this follows from the <strong>BCH formula</strong>, the local diffeomorphism nature of the exponential map and the fact that an open neighbourhood of 1 generates the group, but I have no intuition at all why the BCH formula "should" be true.</li>
<li>What's the deal with <strong>simply-connected groups</strong>? I can certainly see why the Lie algebra cannot detect disconnectedness in a group -- I had expected that it could not detect compactness either (whether the group closes in on itself eventually), so the statement of Lie's third theorem would be "every Lie algebra has a corresponding unique connected, compact Lie group". Instead, the statement is "every Lie algebra has a corresponding <em>simply connected</em> Lie group".</li>
<li><strong>Non-surjectivity of the exponential map</strong> even in the connected case -- I'm not asking for counter-examples, I'm asking "what exactly goes wrong in groups like <span class="math-container">$SL_\mathbb{R}(2)$</span>?", perhaps a hint about "what does the image of the exponential map look like?" (as an analogy, I would explain smooth functions failing to be analytic as "they are flatter than every polynomial at 0, and can be constructed as <span class="math-container">$1/f(1/x)$</span> where <span class="math-container">$f$</span> is any function that grows faster than every polynomial")</li>
<li><strong>Surjectivity when every element is contained in a maximal torus</strong> -- I read this <a href="https://en.wikipedia.org/wiki/Maximal_torus#Properties" rel="nofollow noreferrer">here</a> as a generalisation of "the exponential map is surjective in the connected compact case". Even if the generalisation isn't true, is there an intuitive way to understand why compactness makes the problem in the previous point go away.</li>
<li><strong>Characterisation of non-Abelian Lie groups</strong> -- Tell me if my understanding of simple and semisimple Lie algebras makes sense -- we want to classify non-Abelian Lie groups as products like we do Abelian Lie groups, and the only way to do so is as "semidirect products of simple Lie groups and Abelian Lie groups". A <strong>reductive</strong> Lie group is basically when this semidirect product is a direct product, and a <strong>semi-simple</strong> one is a reductive Lie group where there are no Abelian groups in the product. Is this right?</li>
<li><strong>Various abstract algebraic things</strong> -- I have no idea how to interpret things like nilpotent and solvable Lie algebras, radicals and so on in the context of Lie theory. </li>
<li>At first when I heard of the <strong>Killing form</strong>, I presumed it would be some "natural" way to define a dot product on the Lie algebra -- but I honestly don't see how it is natural. Is it the <em>only</em> dot product that is invariant under Lie algebra automorphisms? </li>
</ul>
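<p>(On the non-surjectivity bullet, one concrete thing I can verify numerically: any <span class="math-container">$X\in\mathfrak{sl}(2,\mathbb{R})$</span> has eigenvalues <span class="math-container">$\pm\lambda$</span> with <span class="math-container">$\lambda$</span> real or purely imaginary, so <span class="math-container">$\mathrm{tr}\,e^X$</span> is either <span class="math-container">$2\cosh\lambda\ge 2$</span> or <span class="math-container">$2\cos\theta\ge -2$</span> -- hence elements with trace below <span class="math-container">$-2$</span>, like <span class="math-container">$\mathrm{diag}(-4,-1/4)$</span>, are missed by <span class="math-container">$\exp$</span>. A sketch, with a truncated-series matrix exponential:)</p>

```python
import numpy as np
from math import factorial

def expm(A, terms=40):
    # truncated Taylor series; fine for the small matrices sampled here
    out = np.zeros_like(A)
    P = np.eye(A.shape[0])
    for k in range(terms):
        out = out + P / factorial(k)
        P = P @ A
    return out

rng = np.random.default_rng(0)
for _ in range(1000):
    a, b, c = rng.uniform(-2, 2, 3)
    X = np.array([[a, b], [c, -a]])          # random element of sl(2, R)
    assert np.trace(expm(X)) >= -2 - 1e-9    # tr(exp X) is 2cosh or 2cos

g = np.diag([-4.0, -0.25])                   # in SL(2, R): det = 1
assert np.isclose(np.linalg.det(g), 1) and np.trace(g) < -2
print("every sampled exp(X) has trace >= -2; diag(-4, -1/4) does not")
```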
<p>I've thought very hard about the theory, but I just can't seem to figure out how to fill these "holes". <strong>Am I missing some important central insight into Lie theory that are crucial to some of these questions?</strong></p>lie-groupslie-algebrasintuitionmatrix-exponentialTue, 24 Sep 2019 17:22:34 GMThttps://math.stackexchange.com/q/3368390Abhimanyu Pallavi Sudhir2019-09-24T17:22:34ZLie group topology
https://thewindingnumber.blogspot.com/2019/09/lie-group-topology.html
0I'll assume you have a basic understanding of general topology -- if not, consult the <a href="https://thewindingnumber.blogspot.com/p/2204.html">topology articles here</a>. Most of the abstract stuff and "weird" cases are not really important, because it is easy to see that Lie groups are manifolds.<br /><br />We need to be careful while studying the topology of Lie groups, because we already have an intuitive picture of a Lie group, and we need to be careful to prove all the things we just "believe" to be true.<br /><br />The main point of the topology of a Lie group is that the group elements define the "flows" on the manifold. What this means is that <b>left-multiplication is a homeomorphism</b>, and it's not absurd to say that <b>inversion is a homeomorphism</b>, because it represents a "reflection" of the manifold. That these conditions make sense is confirmed by looking at the proofs of the following "obvious" facts.<br /><br /><b>(1) In a connected group, a neighbourhood of the identity generates the entire group,</b> i.e. $H\le G\land H\in N(1)\implies H=G$ for connected $G$.<br /><br />Let's think about why this is true. Why does $H$ need to be a neighbourhood -- why must it contain an open set containing the identity? 
Suppose instead we just knew it contained a set $Q$ that looked like this:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-GKetHN8XvRw/XYZe0gFDizI/AAAAAAAAFuo/TNWiIWG-kno_P6VXRJvktYb9_SlRGZjtgCLcBGAsYHQ/s1600/not%2Ba%2Bneighbourhood.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="378" data-original-width="346" height="320" src="https://1.bp.blogspot.com/-GKetHN8XvRw/XYZe0gFDizI/AAAAAAAAFuo/TNWiIWG-kno_P6VXRJvktYb9_SlRGZjtgCLcBGAsYHQ/s320/not%2Ba%2Bneighbourhood.png" width="291" /></a></div><br />Well, $H$ still contains the orange point, but we cannot say it contains the purple point, because it's perfectly happy not containing it -- it's not like we have some vertical element in the Lie group that if you multiplied to some point in $Q$, you'd get the purple point. But instead if $Q$ was an open neighbourhood of the identity:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-MnEQeC8NSoQ/XYZkr1agIEI/AAAAAAAAFu0/v0TxF35QuT4Ctb1FVdxM4y76Rw4-nzWqgCLcBGAsYHQ/s1600/neighbourhood%2Ba%2Byes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="393" data-original-width="370" height="320" src="https://1.bp.blogspot.com/-MnEQeC8NSoQ/XYZkr1agIEI/AAAAAAAAFu0/v0TxF35QuT4Ctb1FVdxM4y76Rw4-nzWqgCLcBGAsYHQ/s320/neighbourhood%2Ba%2Byes.png" width="301" /></a></div>Then the purple point has to be in $H$, because $Q$ contains flows in "all directions" on the group. To actually prove that every point will be contained in $H$ -- well, we know that the point is (will eventually be) that $H$ is the connected component of $G$ (and since $G$ is connected, $H=G$) -- let's just show that $H$ is both open and closed, i.e. nothing in $H$ touches its exterior, and nothing in its exterior touches $H$. 
Here's the proof:<br /><ul><li><b>Nothing in $H$ touches its exterior -- </b>Suppose $\exists x\in H, x\in\mathrm{cl}(H')$. Then the open set $xQ$ contains a point in $H'$ -- but $xQ\subseteq H$, contradiction.</li><li><b>Nothing outside $H$ touches it</b> -- Suppose $\exists x\in H', x\in\mathrm{cl}(H)$. Then $xQ$ contains a point in $H$, so $x$ must be in $H$ (as $x=(xq)q^{-1}$ with $xq\in H$ and $q\in Q\subseteq H$).</li></ul>We're really just formalising the notion of "translating $Q$ to its edges to extend $H$ further and further". The key fact we've used here is, of course, that left-multiplication is a homeomorphism, so $xQ$ is still an open set.<br /><b><br /></b><b>(2) The connected component of the identity is a subgroup.</b><br /><b><br /></b> The idea is that taking two elements $g,h$ of the connected component, their product should remain in the connected component. Once again, this follows from the <b>continuity of left-multiplication</b> -- considering the action of left-multiplication by $g$ on the connected component, its continuity implies that the image must remain connected.<br /><b><br /></b><b>(3) If a subgroup contains a neighbourhood of the identity, it contains the connected component of the identity.</b><br /><div><b><br /></b> Corollary to (1) and (2).<br /><b><br /></b></div><div><b>(4) The connected component of the identity is a <i>normal</i> subgroup.</b><br /><b><br /></b> Conjugation is a continuous map.<br /><b><br /></b><b>(5) Open subgroups are closed.</b><br /><b><br /></b> Corollary to (3). Alternate proof: the complement is the union of some cosets, which are open sets too. A partial converse holds for closed sets: closed subgroups of finite index are open.<br /><br />What this means: any open subgroup is a union of connected components.<br /><br /><b>(6) Intuition for compact subgroups</b><br /><b><br /></b>How can a Lie group possibly "close in on itself"? Surely we keep "extending" an open neighbourhood $W$ of the identity by observing that $xW$ must be in the subgroup?
The idea is that these translations of $W$ form an <b>open cover of the group</b>; if it has a <b>finite subcover</b>, then it makes sense for the group to close in on itself. By playing around with different open neighbourhoods $W$ and taking some suitable unions, one can see that this is equivalent to the condition that every open cover has a finite subcover, i.e. the group is compact.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-cHRhxfTndyo/XYcEqmYku_I/AAAAAAAAFvA/S4UjQafMaX81u0p1emQk8-6kgy8yEWbXgCLcBGAsYHQ/s1600/open%2Bcover%2B-%2Bselect.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="393" data-original-width="408" height="308" src="https://1.bp.blogspot.com/-cHRhxfTndyo/XYcEqmYku_I/AAAAAAAAFvA/S4UjQafMaX81u0p1emQk8-6kgy8yEWbXgCLcBGAsYHQ/s320/open%2Bcover%2B-%2Bselect.png" width="320" /></a></div><br /></div><div><b>(7) A compact, connected Abelian Lie group is a torus.</b></div><div><b><br /></b> This is a generalisation of "a finite Abelian group is the direct product of cyclic groups".<br /><br />The idea behind the proof is that in the Abelian case, the exponential map is a homomorphism from the Lie algebra to the Lie group, but the Lie algebra cannot detect compactness in the Lie group -- the kernel of the exponential map can.
We know from our study of the exponential map that it has a discrete kernel, and in the Abelian case is surjective -- thus the Lie group is homeomorphic to $\mathbb{R}^n/\mathbb{Z}^n$, which is an $n$-torus.<br /><b><br /></b></div><div><b>(8) A connected Abelian Lie group is a cylinder (direct product of a torus and an affine space)</b><br /><b><br /></b>Analogous to above, except $\mathbb{R}^m/\mathbb{Z}^n$ where $m\ge n$.</div><div></div>compactnessconnected componentconnectednessgroup theorygroupslie group topologylie groupslie theorynormal subgroupopen setstopologyMon, 23 Sep 2019 15:55:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7539770682824114785Abhimanyu Pallavi Sudhir2019-09-23T15:55:00ZComment by Abhimanyu Pallavi Sudhir on Examples about maximal Abelian subgroup is not a maximal torus in compact connected Lie group $G$.
https://math.stackexchange.com/questions/2045430/examples-about-maximal-abelian-subgroup-is-not-a-maximal-torus-in-compact-connec/2045693#2045693
"maximal abelian subgroup which is not a maximal torus must necessarily have lower dimension than the maximal tori" -- You mean for a compact Lie group, right?Mon, 23 Sep 2019 15:33:50 GMThttps://math.stackexchange.com/questions/2045430/examples-about-maximal-abelian-subgroup-is-not-a-maximal-torus-in-compact-connec/2045693?cid=6928667#2045693Abhimanyu Pallavi Sudhir2019-09-23T15:33:50ZAnswer by Abhimanyu Pallavi Sudhir for How to develop intuition in topology?
https://math.stackexchange.com/questions/576593/how-to-develop-intuition-in-topology/3364031#3364031
0<p>Let's do an example: let's say we want to know when limits are unique in a topological space. Here's the proof of the theorem in a metric space:</p>
<blockquote>
<p>Let <span class="math-container">$(a_n)$</span> be a sequence with limits <span class="math-container">$L_1$</span> and <span class="math-container">$L_2$</span>. Then <span class="math-container">$a_n$</span> is eventually within every neighbourhood of <span class="math-container">$L_1$</span> and every neighbourhood of <span class="math-container">$L_2$</span>. If <span class="math-container">$L_1\ne L_2$</span>, we can choose the neighbourhoods to be disjoint. Contradiction.</p>
</blockquote>
<p>This is completely equivalent to the proof you've probably seen, but I've phrased everything in terms of neighbourhoods, which are fundamentally topological concepts. The only fact we used is the existence of disjoint neighbourhoods of distinct points. Limits being unique is pretty important, so we call a space where distinct points allow disjoint neighbourhoods a <strong>Hausdorff space</strong> or <strong>T2 space</strong>.</p>
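<p>A minimal non-example (my own sketch): in the two-point Sierpiński space, whose only open sets are <span class="math-container">$\emptyset$</span>, <span class="math-container">$\{1\}$</span> and <span class="math-container">$\{0,1\}$</span>, the points <span class="math-container">$0$</span> and <span class="math-container">$1$</span> have no disjoint neighbourhoods, and the constant sequence <span class="math-container">$1,1,1,\dots$</span> converges to <em>both</em> points:</p>

```python
# Sierpinski space on {0, 1}: the open sets are {}, {1}, {0, 1}.
opens = [set(), {1}, {0, 1}]
seq = [1] * 10                     # the constant sequence 1, 1, 1, ...

def converges_to(seq, p):
    # seq -> p iff seq is eventually inside every open neighbourhood of p;
    # for a constant sequence "eventually" just means "always"
    return all(all(x in U for x in seq) for U in opens if p in U)

assert converges_to(seq, 1)        # expected
assert converges_to(seq, 0)        # limits are NOT unique: not Hausdorff
print("the constant sequence 1 converges to both 0 and 1")
```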
<p>(It's also worth thinking about why the generalisation goes for limits of <em>nets</em>, rather than limits of <em>sequences</em>)</p>
<p>The trick I'm suggesting is to "work backwards" from theorems you can tell are important (as opposed to some inane statement about open sets): (1) start with an important theorem in analysis, (2) go through its proof, (3) work out what axioms you need and simplify them to a form involving just open sets.</p>
<p>Some more examples of such generalisations:</p>
<ul>
<li>Every open neighbourhood of a limit point of <span class="math-container">$S$</span> contains an infinite number of points in <span class="math-container">$S$</span>. (T1 space)</li>
<li>Finite sets are closed. (T1 space)</li>
<li>Continuous extension theorem. (T4 space)</li>
<li>Bolzano-Weierstrass theorem (compact sets)</li>
<li>Intermediate value theorem (connected sets)</li>
</ul>
<p>You may find this series of articles I wrote illuminating to this end: <a href="https://thewindingnumber.blogspot.com/p/2204.html" rel="nofollow noreferrer">https://thewindingnumber.blogspot.com/p/2204.html</a></p>Sat, 21 Sep 2019 04:32:49 GMThttps://math.stackexchange.com/questions/576593/-/3364031#3364031Abhimanyu Pallavi Sudhir2019-09-21T04:32:49ZComment by Abhimanyu Pallavi Sudhir on A compact, connected, abelian Lie group is a torus?
https://math.stackexchange.com/questions/1396822/a-compact-connected-abelian-lie-group-is-a-torus/2076198#2076198
@Hawk $\mathrm{exp}$ is a homeomorphism at the identity, so there is a neighbourhood of 0 that doesn't contain any other kernel points. Suitable left-multiplications show it is discrete everywhere.Fri, 20 Sep 2019 07:07:51 GMThttps://math.stackexchange.com/questions/1396822/a-compact-connected-abelian-lie-group-is-a-torus/2076198?cid=6921320#2076198Abhimanyu Pallavi Sudhir2019-09-20T07:07:51ZMixed states II: decoherence; important measures of purity and entropy
https://thewindingnumber.blogspot.com/2019/09/mixed-states-ii-decoherence-important.html
0<b>Decoherence</b><br /><br />At the end of this section, you should be able to:<br /><ul><li>appreciate why the density matrix is really a great way of expressing states, even for pure states (they uniquely determine the dynamics of the system, without any "overall phase", etc.)</li><li>develop an intuition for measurement, even "inadvertent" measurement</li><li>understand on a somewhat high level how classical physics arises as a limit of quantum physics</li><li>hang out with Wigner's friend</li><li>admit that complex phases matter in quantum mechanics and link them to interference</li></ul><br />Let's talk about <b>measurement</b>.<br /><br />Suppose we have a system that we wish to measure under an operator whose eigenvectors are $|0\rangle_A$ and $|1\rangle_A$. The idea is that we have some measurement apparatus, and the original combined state evolves from something like:<br /><br />$$|\psi\rangle_{AB}=(\lambda|0\rangle_A+\mu|1\rangle_A)\otimes|0\rangle_B$$<br />To the entangled state:<br /><br />$$|\psi\rangle_{AB} = \lambda|0\rangle_A\otimes|0\rangle_B+\mu|1\rangle_A\otimes|1\rangle_B$$<br />Then observing the apparatus is sufficient to observe the system. The idea is that ultimately, the observer himself (or his "knowledge") is the apparatus, and he entangles with the system to measure it.<br /><br />Well, we know that often, we end up seeing things we didn't really want to. After all, physics does not care about your wants and preferences. In fact, in pretty much any situation, information about the system <i>will</i> <b>leak</b> out into the surroundings in some specific way. For example, Schrodinger's cat leaks information about the life of the cat by making the environment smelly, i.e.
the state evolves from:<br /><br />$$|\psi\rangle_{AB}=(\lambda|\mathrm{alive}\rangle+\mu|\mathrm{dead}\rangle)\otimes|\mathrm{clean}\rangle$$<br />To the entangled state:<br /><br />$$|\psi\rangle_{AB}=\lambda|\mathrm{alive}\rangle\otimes|\mathrm{clean}\rangle+\mu|\mathrm{dead}\rangle\otimes|\mathrm{smelly}\rangle$$<br />What this means is that the density matrix of the cat evolves as:<br /><br />$$\left[ {\begin{array}{*{20}{c}}{{{\left| \lambda \right|}^2}}&{\lambda \bar \mu }\\{\mu \bar \lambda }&{{{\left| \mu \right|}^2}}\end{array}} \right] \mapsto \left[ {\begin{array}{*{20}{c}}{{{\left| \lambda \right|}^2}}&0\\0&{{{\left| \mu \right|}^2}}\end{array}} \right]$$<br />(Check that I got the right transpose.) OK, what happened here?<br /><br />Recall that the probabilities of collapsing to $|0\rangle$ and $|1\rangle$ are determined purely by the elements on the diagonal -- the off-diagonal elements, or the <b>coherences</b>, are only relevant for collapsing on to some combination of $|0\rangle$ and $|1\rangle$. What's going on here is that when the environment entangles with the system, it has "kinda" already observed it -- like your Wigner's friend. It "knows" that the system isn't in $|0\rangle+|1\rangle$, and even though you haven't observed the environment yet (you haven't smelled it), you know how the combined state has evolved, and the probability has become a <b>classical probability</b>, because the quantum stuff has already been observed -- by the environment.<br /><br /><b>The idea behind decoherence is the same idea that ensures that the Wigner's friend scenario is consistent.</b><br /><b><br /></b> "Eventually", "all" the information about the system will leak into the environment -- i.e. 
in principle, we should be able to determine anything about the system from measuring the environment, and our uncertainty about the system arises entirely from our <b>completely classical uncertainty</b> about the environment -- so the density matrix becomes a classical one, i.e. a <b>diagonal one</b> (the off-diagonal terms go to zero).<br /><br />What basis is it diagonal in? In the basis corresponding to the states of the environment -- i.e. if the environment can be in states $|0\rangle_B$ and $|1\rangle_B$, then the states of the system that precisely induce these states of the environment form the preferred basis. These are often called the "<b>environmentally selected basis</b>".<br /><br />This process is called <b>decoherence</b>. You may also hear the terms <b>pointer states</b> (for the preferred basis), <b>einselection</b> (<i>environmentally induced selection</i> of the preferred basis), or <b>Quantum Darwinism</b> (what the heck?) -- but they're really synonymous. We'll just use the fancy words when they're grammatically useful.<br /><br />Well, the following may not be completely clear, but you should at least be able to appreciate that it is true: the off-diagonal terms <i>approach</i> zero, rather than hit it. Why? Although the system leaks information into the surroundings, we aren't really certain about what we're inferring about the system from the environment -- a live cat may be smelly too, etc. So the pointer states are not exactly orthogonal, either.<br /><br />The precise behavior of decoherence depends on the Hamiltonian of the system -- e.g. predicting the generation of the smelliness of the air from the state of the cat based on what's going on microscopically is something that could be done in principle by solving a really complicated Schrodinger equation. You can, given a Hamiltonian, at least make order-of-magnitude estimates of at how much time and at how macroscopic a scale (i.e. 
with how many degrees of freedom) does the system begin to behave in a way that can be described as classical.<br /><br /><div class="twn-pitfall">Decoherence does <em>not</em> remove the need for wavefunction collapse -- one still needs the observer to note an observation, collapsing the system.</div><br />TBC: purity, entropy, correlation functionsdecoherencedensity matrixphysicsquantum mechanicstensor productwigner's friendMon, 16 Sep 2019 18:47:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-415557332515658087Abhimanyu Pallavi Sudhir2019-09-16T18:47:00ZLie group homomorphisms
https://thewindingnumber.blogspot.com/2019/09/lie-group-homomorphisms.html
0Because a Lie group is fundamentally a group that is also a manifold, we'd like to define a Lie group homomorphism as one that is both a <i>group homomorphism</i>, and <i>smooth</i>. For this, though, we need to define what it means to differentiate a group homomorphism.<br /><br />Recall that the general notion of a derivative is the idea of "how does the map work locally"? Letting a general function $f:G\to H$ map a curve $\gamma(t)$, it should be easy to see that $\gamma'(t)$ transforms as $(f\circ\gamma)'(t)$ (make sure that this makes sense -- think in terms of the chain rule, or write it out in limit form, or just in terms of the image of the curve).<br /><br />Consequently this leads to the <b>differential </b>$df:dG\to dH$ (where $dG$ is the Lie Algebra of $G$) defined as $df(\gamma'(0))=(f\circ\gamma)'(0)$. Some short exercises:<br /><ul><li>Confirm that this is equivalent to saying that $df(X)$ is the <b>directional derivative</b> of $f$ in the $X$ direction.</li><li>Differentiate $f(xyx^{-1}y^{-1})$ with respect to $x$ in the $X$ direction at $x=1$ (hint: this is a direct application of the definition of the differential in reverse).</li><li>Convince yourself that any derivative operator commutes with $df$, i.e. $D(df(X))=df(D(X))$.</li></ul>It should be intuitively clear that if $f$ is a homomorphism, its local effect should be to act as a homomorphism of the Lie algebra as it should preserve all local structure. We can easily show that:<br /><ol><li>Since $df$ is a derivative of $f$, its value must be a <b>linear map</b> (like the Jacobian). This applies to the derivative as an operator on the tangent space of any manifold -- $f$ doesn't need to be a group homomorphism at all.</li><li>It <b>preserves the Lie bracket</b>. 
Take $f(xyx^{-1}y^{-1})=f(x)f(y)f(x)^{-1}f(y)^{-1}$ and differentiate it once with respect to $x$ in the $X$ direction at $x=1$, obtaining: $df(X-yXy^{-1})=df(X)-f(y)df(X)f(y)^{-1}$, simplify and differentiate it with respect to $y$ in the $Y$ direction at $y=1$ to get: $df([Y,X])=[df(Y),df(X)]$.</li></ol><hr /><br /><b>The adjoint map</b><br /><br />The Lie Bracket $[Y,X]$ is <em>not</em> the derivative of conjugation $gxg^{-1}$, so you don't have to worry -- the Lie Bracket is not a Lie algebra homomorphism (it doesn't preserve Lie Brackets), the derivative of conjugation at the identity is zero. That's unfortunate -- our <a href="https://thewindingnumber.blogspot.com/2019/06/derivations-and-jacobi-identity.html">explanation of the Jacobi identity</a> ("a derivation acts through the Lie Bracket as a derivation on the space of derivations where multiplication is given by the Lie Bracket") really indicated that it has something to do with it.<br /><br />The Lie Bracket <em>is</em> the derivative of conjugation $xgx^{-1}$. OK, so?<br /><br />Here's the idea: $\mathrm{Ad}(x)(y)=xyx^{-1}$ defines a homomorphism $\mathrm{Ad}:G\to\mathrm{Aut}(G)$. Its differential $\mathrm{ad}:dG\to d\mathrm{Aut}(G)$ can be confirmed to be the Lie Bracket $\mathrm{ad}(X)(Y)=[X,Y]$. So preservation of the Lie Bracket means:<br /><br />$$\mathrm{ad}([X,Y])=[\mathrm{ad}(X),\mathrm{ad}(Y)]$$<br />This is <b>precisely the Jacobi identity</b>! So the Lie bracket <i>is</i> a Lie algebra homomorphism, from a Lie algebra to the Lie algebra of half-filled Lie brackets.<br /><br />There is indeed a relationship between this "homomorphism" understanding of the Jacobi identity and the "derivation" understanding. In general, given a curve $\phi:\mathbb{R}\to\mathrm{Aut}(G)$, differentiating $\phi(t)(gh)=\phi(t)(g)\phi(t)(h)$ at $t=0$ we see that its derivative $d\phi$ satisfies the product rule, i.e. 
is a derivation (in fact this is true even when $G$ is not a group -- often a Lie group arises this way, as the automorphism group of some object, and these derivations then form its Lie algebra). This implies<br /><br />$$d\mathrm{Aut}(G)\subseteq\mathrm{Der}(dG)$$<br />So $[X,\cdot]$ is a derivation, and the map from $X$ to $[X,\cdot]$ is a Lie algebra homomorphism $dG\to\mathrm{Der}(dG)$. This really does give us a much more general way to look at everything we talked about in the <a href="https://thewindingnumber.blogspot.com/2019/06/derivations-and-jacobi-identity.html">last article</a>.<br /><br /><div class="twn-pitfall">Wait -- shouldn't it be an equality? <a href="https://thewindingnumber.blogspot.com/2019/06/derivations-and-jacobi-identity.html">I thought all derivations were part of the Lie Algebra</a>? Ah, but there the derivations on $M$ formed the Lie Algebra of $\mathrm{Aut}(M)$, i.e. $d\mathrm{Aut}(M)=\mathrm{Der}(M)$. So indeed $d\mathrm{Aut}(dG)=\mathrm{Der}(dG)$. This makes sense, indeed $\mathrm{Aut}(G)\subseteq \mathrm{Aut}(dG)$. It's interesting to think about when it is that the Lie algebra has "more" automorphisms than the Lie group.</div><br /><div class="twn-furtherinsight">One may wonder if all automorphisms of a group are a conjugation by something -- or equivalently, if all automorphisms of a Lie algebra are a derivation of some kind. We will later see a special class of Lie groups for which this is true -- in general, the conjugation automorphisms are called the <b>inner automorphisms</b> of the group and are denoted as $\mathrm{Inn}(G)$. 
The group of <em>all</em> invertible linear transformations $dG\to dG$, meanwhile, is denoted $GL(dG)$, and it's easy to see that every invertible linear map preserves the Lie bracket iff the Lie algebra is Abelian.</div><br /><b>Exercise:</b> Show that the map $\mathrm{Ad}:G\to \mathrm{Aut}(G)$ is injective iff $G$ has a trivial center.<br /><br />So if $G$ has trivial center and all its automorphisms are inner, it is isomorphic to $\mathrm{Aut}(G)$ and is called <b>complete</b>.<br /><br /><hr /><br /><b>The determinant map</b><br /><b><br /></b> The determinant is a homomorphism $\det:GL_F(n)\to F^\times$ from any matrix group. The first thing we'd like to do with this is find its differential $\det'$ (which will be an $F$-valued function on $M_F(n)$). By definition of the differential:<br /><br />$$\det' A = \lim_{\varepsilon\to 0}\frac{\det (I+\varepsilon A)-1}{\varepsilon}$$<br />By writing out the entries of the matrix as $\delta_{ij}+\lambda_{ij}\varepsilon$ and performing induction on the dimension of the matrix, it's easy to prove that this is equivalent to:<br /><br />$$\det'A=\mathrm{tr} A$$<br /><hr /><br /><b>Lie algebra homomorphisms in detail: ideals</b><br /><b><br /></b>Well, Lie algebra homomorphisms are a specific category of vector space homomorphisms, aren't they? It's not enough that they preserve the linear structure, they must preserve the Lie bracket too. So let's study them in more detail -- like a crash course through linear algebra, but with Lie algebras instead.<br /><br />What does the kernel of a Lie algebra homomorphism $A$ look like? Well, because the homomorphism preserves linear combinations, the kernel must be a linear subspace -- similarly, because the homomorphism preserves the Lie bracket, we must have that $Av=0\implies \forall w\in\mathfrak{g}, A[v,w]=0$, i.e. the kernel must be closed under derivations from $\mathfrak{g}$: $[\mathfrak{g},\mathfrak{i}]\subseteq\mathfrak{i}$. 
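As a concrete instance of the kernel condition just derived, take the determinant map from above: its differential $\mathrm{tr}$ is a Lie algebra homomorphism $\mathfrak{gl}(n)\to F$ (the bracket on $F$ being zero), and its kernel, the traceless matrices, is closed under brackets with <em>arbitrary</em> matrices, since every commutator is traceless. A quick numerical sketch (Python with numpy assumed; the matrices are random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))            # arbitrary element of gl(n)
B = rng.standard_normal((n, n))
B = B - np.trace(B) / n * np.eye(n)        # project B into ker(tr), i.e. sl(n)

# Closure under derivations from gl(n): [gl(n), sl(n)] ⊆ sl(n),
# because tr([A, B]) = tr(AB) - tr(BA) = 0 for *any* A, B.
bracket = A @ B - B @ A
assert abs(np.trace(bracket)) < 1e-12

# The differential of det at the identity is tr: det(I + εA) ≈ 1 + ε tr(A).
eps = 1e-7
numeric = (np.linalg.det(np.eye(n) + eps * A) - 1) / eps
assert abs(numeric - np.trace(A)) < 1e-4
```

The first assertion is the closure condition $[\mathfrak{gl}(n),\ker\mathrm{tr}]\subseteq\ker\mathrm{tr}$; the second re-checks $\det'=\mathrm{tr}$ by finite differences.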
Such a subalgebra is called an <b>ideal</b>.<br /><br /><b>Exercise:</b> Show that the Lie algebra of a normal subgroup is an ideal (careful -- it's not as obvious as you might think -- but still pretty obvious).

Labels: adjoint map, conjugation, derivations, determinant, homomorphism, ideal, jacobi identity, lie algebra, lie bracket, lie groups, lie theory, mathematics, normal subgroup

Fri, 13 Sep 2019 19:36:00 GMT noreply@blogger.com tag:blogger.com,1999:blog-3214648607996839529.post-2975693346347530459 Abhimanyu Pallavi Sudhir 2019-09-13T19:36:00Z

Answer by Abhimanyu Pallavi Sudhir for Adjoint map is Lie homomorphism
https://math.stackexchange.com/questions/1339289/adjoint-map-is-lie-homomorphism/3355482#3355482
<p><span class="math-container">$\mathrm{ad}_X$</span> is not a Lie Homomorphism, but <span class="math-container">$\mathrm{ad}$</span> is. We can define a map <span class="math-container">$\mathrm{Ad}:G\to\mathrm{Aut}(G):=\lambda x.\lambda y.\ xyx^{-1}$</span>, whose differential is then <span class="math-container">$\mathrm{ad}:TG\to T\mathrm{Aut}(G):=\lambda X.\lambda Y.\ [X,Y]$</span>. The homomorphism property on this map is then precisely the Jacobi identity.</p>

Fri, 13 Sep 2019 17:49:45 GMT https://math.stackexchange.com/questions/1339289/-/3355482#3355482 Abhimanyu Pallavi Sudhir 2019-09-13T17:49:45Z

Time evolution, Schrodinger and Heisenberg pictures, Noether's theorem
https://thewindingnumber.blogspot.com/2019/09/time-evolution-schrodinger-equation.html
So far, we have discussed quantum mechanics without any reference to changes across time. You might think we could just upgrade $\psi(x)$ to $\psi(x,t)$ and e.g. an observable $Q_t$ measuring a value $q$ at time $t$ would have eigenvectors whose cross-sections at $t$ are of the form $\delta(x-a)$. But this would mean the entire $\psi(x,t)$ is the state of the object, rather than there being a state at each value of $t$, and time would be an observable. This is clearly not what we want (right now -- although to be consistent with special relativity we will need to treat space and time on an equal footing later in this series).<br /><br />Instead, a more appropriate approach is to say that the state is a function of time $|\psi(t)\rangle$ and the evolution of the state is given by some operation $|\psi(t)\rangle=U[|\psi(0)\rangle]$. <br /><br />How do we know that $U$ is a <b>linear operator</b>? What does it mean for $U$ to be a linear operator anyway? The only sense in which such a linearity can be tested is by looking at a state in a superposition. So suppose $|\psi(0)\rangle=|\psi_1(0)\rangle+|\psi_2(0)\rangle$. Now $|\psi(t)\rangle=U[|\psi_1(0)\rangle+|\psi_2(0)\rangle]$ -- this is from the perspective of some observer Alice.<br /><br />But if another observer Bob had previously observed and collapsed the system to $|\psi_1(0)\rangle$ at time 0, then according to him, the state should evolve to $U[|\psi_1(0)\rangle]$, and if he had observed the system in $|\psi_2(0)\rangle$, his knowledge of the system would evolve to $U[|\psi_2(0)\rangle]$.<br /><br />So according to Alice, who doesn't know what Bob has observed (she has not observed him), her knowledge of the system can also be written as $U[|\psi_1(0)\rangle]+U[|\psi_2(0)\rangle]$. Thus<br /><br />$$U[|\psi_1(0)\rangle+|\psi_2(0)\rangle]=U[|\psi_1(0)\rangle]+U[|\psi_2(0)\rangle]$$<br />I.e. $U$ is linear, so we can write it as a linear operator as in $U|\psi(0)\rangle$. 
(The above scenario is called <b>Wigner's friend</b>)<br /><br />$U$ is also clearly a <b>unitary operator</b>, as it must preserve all lengths.<br /><br />We can consider infinitesimal time evolutions $U_t(dt)$ representing evolution of the state from $t$ to $t+dt$. Then:<br /><br />$$|\psi(t)\rangle=U_0(dt)\dots U_{t-dt}(dt)|\psi(0)\rangle$$<br />This <a href="https://thewindingnumber.blogspot.com/p/2202.html">product integral</a> can be written alternatively as:<br /><br />$$|\psi(t)\rangle=\mathcal{T}\left\{e^{\int \ln U_t(dt)}|\psi(0)\rangle\right\}$$<br /><div class="twn-furtherinsight">$\mathcal{T}$ is the <b>time-ordering operator</b> which orders a product like $H(t_1)H(t_2)$ in order of ascending $t$ in an expansion. Can you see why this is necessary? (Hint: $e^Ae^B\ne e^{A+B}$ for noncommuting $A,B$.)</div><br /><div class="twn-pitfall">Oh, and it's not actually an operator -- not even in the math sense, it's a "formal operation", one that takes a <em>form</em> or sentence (rather than its value) -- in this case the $\exp$ Taylor expansion -- and changes it in some way.</div><br />$\ln U_t(dt)$ is an infinitesimal, and it's easy to see that it is equal to $U_{t}'(0)dt$ -- a member of the Lie algebra. We know, of course, that the Lie Algebra of the unitary group consists of <b>anti-Hermitian operators</b> (this can be checked without Lie Algebra, of course), and so $iU_{t}'(0)$ is Hermitian. From Lie Algebra, we can tell that this represents a <b>generator of time translations</b> -- and from a little experience of classical mechanics, we want this to represent energy. So for dimensional consistency with energy, we write:<br /><br />$$H(t)=i\hbar U_{t}'(0)$$<br />(Why $\hbar$ and not $h$? Because $U_{t}'(0)$ is basically already in "radians per second".) This is called the <b>Hamiltonian operator</b>. It determines $U_t$, and thus describes the time evolution of a state. How exactly? 
Since:<br /><br />$$|\psi(t+dt)\rangle=U_t(dt)|\psi(t)\rangle$$<br />Re-arranging, we can write:<br /><br />$$\frac{\partial |\psi(t)\rangle}{\partial t}=-\frac{i}{\hbar}H(t)|\psi(t)\rangle$$<br />This is the most general form of the <b>Schrodinger equation</b>. Note that the earlier exponential equation is the "general solution" to this equation -- obviously not very useful as it stands -- rewritten as what is known as the <b>Dyson series</b>:<br /><br />$$|\psi(t)\rangle=\mathcal{T}\left\{e^{-i/\hbar\int H(t) dt}|\psi(0)\rangle\right\}$$<br />It's easy to show from this that the evolution of a density matrix $\rho(t)$ is similarly:<br /><br />$$\frac{\partial\rho(t)}{\partial t}=-\frac{i}\hbar [H, \rho]$$<br />Which is the <b>von Neumann equation</b>, whose solution is given by:<br /><br />$$\rho(t)=\mathcal{T}\left\{e^{-i/\hbar\int H(t) dt}\rho(0)e^{i/\hbar \int H(t) dt}\right\}$$<br />These should all appear as obvious special cases of Lie theoretic results.<br /><br /><hr /><br /><div class="twn-pitfall">$H(t)$ is <em>not</em> the same as $i\hbar\partial/\partial t$. $H(t)$ is a Hermitian operator, i.e. an observable, while $\partial/\partial t$ does not act on the Hilbert space at all. One could also see what could go wrong by equating the two in the "solution to the Schrodinger equation" above. The Schrodinger equation does not say that $H$ and the time-derivative are equal in general -- rather, it says that they are the same <i>on a valid state vector</i> $|\psi(t)\rangle$ -- you cannot just "factor this out".</div><br />So the Hamiltonian is fundamentally what determines the dynamics of a quantum system. Give me a Hamiltonian, and you've given me a theory. The Schrodinger equation (or equivalently the von Neumann equation) above is just an axiom of quantum mechanics/of any quantum mechanical theory.<br /><br /><hr /><br />Can we talk about the velocity and acceleration observables for a moment? 
Actually, we can't, because they fundamentally have to do with time evolution, and we can't have observables that depend on the time-evolution of the state -- observables must act on the Hilbert space. But we can define observables that predict how the state will evolve (like the Hamiltonian with the Schrodinger equation).<br /><br />Doing this systematically is where the Heisenberg picture comes in.<br /><br />What does this mean? Everything we've discussed so far is the <b>Schrodinger picture</b>, where the state evolves on a fixed background basis created by the observables' eigenvectors -- so time evolution acts as an <b>active transformation</b> of the state. Instead, we can have a completely different picture of reality, the <b>Heisenberg picture</b>, where we view time-evolution as simply viewing the state in a different basis -- a <b>passive transformation</b>, with the observables carrying the time-dependence.<br /><br />OK, so how do we do this? Remember how every question in quantum mechanics can fundamentally be asked in terms of expectation values (specifically those of Hermitian projections). The expected value of an observable at time $t$ of course evolves as:<br /><br />$$\langle A\rangle(t) = \langle\psi|U^*(t)AU(t)|\psi\rangle$$<br />In the Schrodinger picture, we attach the $U(t)$ to $|\psi(0)\rangle$ to make $\langle A\rangle(t)=\langle\psi(t)|A|\psi(t)\rangle$. In the Heisenberg picture instead, we attach the $U(t)$ to the $A(0)$, writing $\langle A\rangle(t)=\langle\psi|A(t)|\psi\rangle$.<br />From differentiating conjugation in $A(t)=U^*(t)A(0)U(t)$, we get:<br /><br />$$\frac{dA}{dt}=\frac{i}\hbar [H, A]$$<br />This is the <b>Heisenberg equation</b>. 
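The Heisenberg equation is easy to sanity-check numerically in finite dimensions. A sketch (assuming numpy, a random Hermitian $H$ and $A(0)$, and $\hbar=1$): build $U(t)=e^{-iHt/\hbar}$ by diagonalising $H$, form $A(t)=U^*(t)A(0)U(t)$, and compare a finite-difference $dA/dt$ against $(i/\hbar)[H,A(t)]$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, t, hbar = 3, 0.7, 1.0

def random_hermitian(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

H = random_hermitian(n)       # Hamiltonian (time-independent here)
A0 = random_hermitian(n)      # observable at t = 0

E, V = np.linalg.eigh(H)      # diagonalise H to build U(t) = exp(-iHt/hbar)

def U(t):
    return V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T

def A(t):                     # Heisenberg-picture observable A(t) = U*(t) A(0) U(t)
    return U(t).conj().T @ A0 @ U(t)

# U(t) is unitary, as the text argues it must be
assert np.allclose(U(t).conj().T @ U(t), np.eye(n))

# dA/dt = (i/hbar)[H, A]: central finite difference vs the commutator
dt = 1e-6
lhs = (A(t + dt) - A(t - dt)) / (2 * dt)
rhs = (1j / hbar) * (H @ A(t) - A(t) @ H)
assert np.max(np.abs(lhs - rhs)) < 1e-6
```

The check uses a time-independent $H$, so the time-ordering subtleties above don't enter; the eigendecomposition is just one convenient way to exponentiate a Hermitian matrix.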
Immediately, it yields:<br /><br />$$\frac{dX}{dt} = \frac{i}{\hbar}[H,X],\qquad \frac{d^2X}{dt^2} = -\frac{1}{\hbar^2}[H,[H,X]]$$<br />Thinking of the evolution of $X$ as a translation of the co-ordinate system, etc., what this does is give us two conditions on what the Hamiltonian should look like for a "Euclidean" system:<br /><br />$$[H,X] = -\frac{i\hbar}{m}P,\qquad [H,[H,X]] = \frac{\hbar^2}{m}U'(x)$$<br />This gives us yet another strong reason (besides the fact that the Hamiltonian generates time-translations, that the "eigenvectors" of $\partial/\partial t$ are the energy states by the de Broglie theorem (but not really), etc.) to suspect that the Hamiltonian represents the <i>energy</i> of the system. Indeed if we use:<br /><br />$$H=\frac{1}{2m}P^2+U(x)$$<br />We can confirm those conditions above. Well, this is certainly not the only Hamiltonian compatible with classical mechanics, so at this point, I'll just say that this is confirmed by experiment, and is an axiom of the quantum theory of Euclidean systems.<br /><br /><b>Exercise:</b> By taking expectation values in the Heisenberg equation, show that $m\frac{d}{dt}\langle x\rangle =\langle p\rangle$ and $\frac{d}{dt}\langle p\rangle = -\langle U'(x)\rangle$ under the Euclidean Hamiltonian. This is called the <b>Ehrenfest theorem</b>.<br /><br /><hr /><br />I'll discuss one final application of the Heisenberg formalism: it makes <b>Noether's theorem</b> completely trivial.<br /><br />Indeed, $dA/dt=0$ iff $[H,A]=0$ iff $\forall\tau, H=e^{-i/\hbar A\tau}He^{i/\hbar A \tau}$. 
Then $A$ is a conserved quantity and conjugation with it as an infinitesimal generator represents a symmetry of the Hamiltonian.

Labels: dyson series, hamiltonian, heisenberg picture, noether's theorem, quantum mechanics, schrodinger's equation, wigner's friend

Tue, 10 Sep 2019 18:40:00 GMT noreply@blogger.com tag:blogger.com,1999:blog-3214648607996839529.post-6078024195458400205 Abhimanyu Pallavi Sudhir 2019-09-10T18:40:00Z

Supplementary definitions: levels of abstraction
https://thewindingnumber.blogspot.com/2019/09/supplementary-definitions-levels-of.html
Although we have been able to generalise a lot of our theorems about the topological properties of metric spaces to topological spaces, several others remain out of our reach, and we need to find the right level of abstraction that makes them true -- and we can usually do this by trying to prove the theorem and seeing what it "needs", then if this theorem actually characterises the space, proving the need is equivalent to the theorem.<br /><br /><b>Separation axioms</b><br /><br /><a href="https://thewindingnumber.blogspot.com/2019/08/topology-iii-example-topologies.html">We've seen the first one.</a> TFAE:<br /><ul><li>$X$ is a T0/Kolmogorov/distinguishable space.</li><li>All points in $X$ are distinguishable.</li><li>For any pair of points, there exists an open set containing exactly one of them.</li></ul>Now, in metric spaces, statements like "there is a point of a set (or sequence) $S$ in every open neighbourhood of $x$" are equivalent to "there are an infinite number of points of $S$ in every open neighbourhood of $x$", i.e. <b>every open neighbourhood of a limit point of $S$ has infinitely many points of $S$</b>. The key here is that at any step in the process, an open neighbourhood of $x$ can be constructed that does not contain any of the previous points in the sequence, i.e. there exists an open neighbourhood of $x$ excluding any given finite set of points.<br /><br />This is possible, e.g. if we decide that <b>finite sets are closed</b> (or equivalently <b>singletons are closed</b>). This is an iff -- if a finite set $S$ isn't closed, there are points in $\mathrm{cl}(S)$ that aren't in $S$, but it's impossible for there to be an infinite number of points in $S$ in a neighbourhood of such a point, as $S$ is finite.<br /><br /><div class="twn-pitfall">Wait, what about for points in $S$ itself -- aren't these all also limit points of $S$? We sneakily changed the definition of limit points from $x\in\mathrm{cl}(S)$ to $x\in\mathrm{cl}(S-\{x\})$, i.e. 
to exclude sequences that include the point itself. So since excluding a set from a finite set keeps it finite, it is still closed, and thus $x$ is not re-included when you close it (so it is not a limit point at all).</div><br /><div class="twn-pitfall">Wait, what about the discrete topology on a finite set? All finite sets are closed, but how could we have an infinite number of anything? The key is that there are no "limit points" of any set. So the only T1 finite spaces are the discrete ones.</div><br />Note how this implies a stronger form of T0 -- T0 said that for every pair of points, at least one of them has an open neighbourhood excluding the other. Well, in a T1 space, we can always just remove any finite number of points from an open neighbourhood and it will remain an open neighbourhood -- so in particular, this means that for every pair of points, <b>each has an open neighbourhood excluding the other</b>, or <b>any two points are separated</b>. In fact, this is clearly an iff (take some finite intersections).<br /><br />The closedness of finite sets has plenty of other implications that match our intuition. You will see some in the list of exercises.<br /><br />So for now, TFAE:<br /><ul><li>$X$ is a T1/Frechet space.</li><li>All points in $X$ are separated.</li><li>For any pair of points, each has an open neighbourhood excluding the other.</li><li>Each open neighbourhood of a limit point of $S\subseteq X$ contains an infinite number of members of $S$.</li><li>Finite sets are closed/singletons are closed.</li></ul>We've seen spaces in which limits are not unique. You might think that topological distinguishability should suffice to have unique limits -- is this right?<br /><br />Let's go through the proof of the uniqueness of limits on metric spaces. Suppose $f$ has two limits $L_1$ and $L_2$ at $c$. Now does it really suffice that there is a neighbourhood of $L_1$ not containing $L_2$? 
Not necessarily -- it may be the case that this neighbourhood of $L_1$ intersects every neighbourhood of $L_2$ because there's just that amount of indiscretion around $L_2$. What's important is that there exist disjoint neighbourhoods of $L_1$ and $L_2$.<br /><br />So the <b>uniqueness of limits</b> is equivalent to the existence of <b>disjoint neighbourhoods</b> for distinct points. This is variously called a Hausdorff space or a T2 space. So TFAE:<br /><ul><li>$X$ is a T2/Hausdorff space.</li><li>All points in $X$ are separated by neighbourhoods.</li><li>Any pair of points has a pair of respective neighbourhoods disjoint from each other.</li><li>Limits of nets/filters are unique.</li></ul>Here's another obvious fact about the topology of metric spaces: any continuous function $f:S\to\mathbb{R}$ on a closed (not just compact) set $S$ may be <b>extended to a continuous function</b> $F:X\to\mathbb{R}$. The proof of this relies on the ability to "connect" points between the pieces of $S$:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-4AjUE-rMf7w/XW6U-Y-8NpI/AAAAAAAAFsc/lUlqnRsFsTozNnVVnjSpLmYZOr0vB4xnQCLcBGAs/s1600/tietze.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="583" data-original-width="887" height="210" src="https://1.bp.blogspot.com/-4AjUE-rMf7w/XW6U-Y-8NpI/AAAAAAAAFsc/lUlqnRsFsTozNnVVnjSpLmYZOr0vB4xnQCLcBGAs/s320/tietze.png" width="320" /></a></div><br />You might think that this just means "having a continuous function between two points", but this is only because of the choice of a one-dimensional space -- in general, the "boundary" of each connected component is not just a finite number of points. What does suffice is that for disjoint closed sets $S$ and $T$ there <b>exists a continuous function $h:X\to\mathbb{R}$ "separating" them</b>, i.e. such that $h(S)=\{0\}$, $h(T)=\{1\}$. 
Then a bunch of such functions can just be added to $f$ to obtain $F$. This is trivially an iff.<br /><br />One may try further to "prove" the existence of such a continuous function $h$. Well, on $\mathbb{R}$, we have a notion of a "midpoint" between the endpoints of the closed sets -- we could have the value of $h$ at this point be $1/2$, and then continue the construction for all dyadic-rational points.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-I7rwx7Ij0Rk/XW6jNqxfszI/AAAAAAAAFso/ntO_QJ1rhHoW4p1KmyIjvSrIO3fq47nQgCLcBGAs/s1600/tietze-urysohn.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="583" data-original-width="887" height="210" src="https://1.bp.blogspot.com/-I7rwx7Ij0Rk/XW6jNqxfszI/AAAAAAAAFso/ntO_QJ1rhHoW4p1KmyIjvSrIO3fq47nQgCLcBGAs/s320/tietze-urysohn.png" width="320" /></a></div><br /><br />In general, we don't really need the notion of a midpoint, but we do need some point with "space around it", i.e. open neighbourhoods of each closed set that don't contain the point. We only need the case where $S$ and $T$ are in a connected component, so this is equivalent to the <b>existence of disjoint open neighbourhoods of disjoint closed sets</b>. This is also clearly an iff, as one may consider preimages of open sets in $[0,1]$. Thus TFAE:<br /><ul><li>$X$ is a T4/normal space.</li><li>Every pair of disjoint closed sets has respective disjoint open neighbourhoods. 
</li><li>Every pair of disjoint closed sets can be separated by a continuous function (<i>Urysohn's lemma</i>).</li><li>Every continuous map on a closed subset can be extended to a continuous map on $X$ (<i>Tietze's extension theorem</i>).</li></ul><b><br /></b> <b>Countability</b><br /><b><br /></b> <a href="https://thewindingnumber.blogspot.com/2019/08/topology-ii-kuratowski-closure-topology.html">We've already seen the first one.</a> TFAE:<br /><ul><li>$X$ is a first-countable space.</li><li>Every point in $X$ has a countable neighbourhood basis.</li><li>$\lim_{x\to a}f(x)=L\iff\forall (x_{n\in\mathbb{N}})\to a, f(x_n)\to L$. </li><li>Every filter has a corresponding sequence.</li></ul>Several other cardinality functions can be defined and other axioms of countability can be introduced, e.g. sequential spaces, separable spaces and second-countable spaces, but we cannot really motivate them at this point. (Second-countability is particularly important when dealing with metrisability, integration and related issues, as it gives rise to the notion of "paracompactness".)<br /><br /><div class="twn-exercises">For the following statements, find the "least restrictive" level of abstraction necessary, and decide if the property characterises the level (i.e. is it an iff?):<br /><ol><li>Every subset $S$ is the union of all the closed sets contained in $S$.</li><li>Every subset $S$ is the intersection of all the open sets containing $S$.</li><li>Every nonempty open set contains a nonempty closed set.</li><li>Every point $x$ admits a nested neighbourhood basis, i.e. a neighbourhood basis totally ordered by $\subseteq$.</li><li>Closed subset $T$ of a compact space $S$ is compact.</li><li>Compact space is closed.</li><li>There is no smallest neighbourhood of any point (i.e. $\forall x\in X,\forall N\in N(x), \exists M\in N(x), N\not\subseteq M$).</li></ol><b>Solutions:</b><br /><ol><li>T1. 
This is equivalent to asking that every point in $S$ is contained in a closed subset of $S$, which is possible if the singleton is closed. For equivalence, consider a singleton $S$.</li><li>T1. Dual to Exercise 1.</li><li>T1 suffices, but is not equivalent (e.g. the indiscrete topology or any other topology full of clopen sets). Maybe if you change to proper containment.</li><li>First-countability. (both sides of equivalence are clear -- think directed set, etc.)</li><li>Any topological space suffices. We need to be careful with the proof here because the notions of closedness and compactness are intuitively very similar. Here it is: any net on $T$ is a net on $S$ and thus has a convergent subnet whose limit is in $S$ -- but its limit must also be in $T$ because $T$ is closed.</li><li>Hausdorff suffices (probably not equivalent, but who cares). The thing is that in general, even on a compact space, although a net in $S$ may have a convergent subnet, that subnet may have other limits outside $S$. But in a Hausdorff space, limits are unique.</li><li>"Not Alexandrov". Let's characterise the topologies that <i>do</i> have smallest neighbourhoods -- these are necessarily the ones for which arbitrary intersections of open sets are open, and are called <i>finitely-generated topologies</i> or <i>Alexandrov topologies</i>. </li></ol></div>

Labels: abstract mathematics, alexandrov topology, first-countable spaces, hausdorff spaces, mathematics, separation axioms, topological indistinguishability, topology

Fri, 06 Sep 2019 09:56:00 GMT noreply@blogger.com tag:blogger.com,1999:blog-3214648607996839529.post-4477522526150699500 Abhimanyu Pallavi Sudhir 2019-09-06T09:56:00Z

Answer by Abhimanyu Pallavi Sudhir for A subset of a compact set is compact?
https://math.stackexchange.com/questions/212181/a-subset-of-a-compact-set-is-compact/3346082#3346082
<p>Here's an alternate proof (for closed subsets, obviously): any net on <span class="math-container">$S$</span> is a net on <span class="math-container">$T$</span> and thus has a convergent subnet whose limit is in <span class="math-container">$T$</span> -- but its limit must also be in <span class="math-container">$S$</span> because <span class="math-container">$S$</span> is closed.</p>
<p>It's a little tricky because the notions of closedness and compactness are intuitively very similar.</p>

Fri, 06 Sep 2019 09:17:08 GMT https://math.stackexchange.com/questions/212181/-/3346082#3346082 Abhimanyu Pallavi Sudhir 2019-09-06T09:17:08Z

Supplementary definitions: compactness, compactification, hemicompactness
https://thewindingnumber.blogspot.com/2019/08/supplementary-definitions-compactness.html
Let's think about the <b>Bolzano-Weierstrass theorem</b> in real analysis (every sequence in a compact set has a convergent subsequence). What would it take to generalise this theorem to a topological space $X$?<br /><br />First, let's write down the statement: we know that <i>sequences</i> aren't very fundamental to arbitrary topological spaces, so the statement we're looking for would be something like: <b>every net in a compact set has a convergent subnet.</b> The question is what the definition of "compact" is in a general topological space, generalising the "closed and bounded" definition for subsets of $\mathbb{R}^n$ (and how to prove the generalised statement from this definition).<br /><br />Let's try to formulate some equivalent ways of stating that. It is trivial to see that the filter associated with a subnet is a filter refinement -- so we can state it as: <b>every filter in a compact set has a convergent refinement.</b> This also implies that <b>every ultrafilter is convergent</b> (as an ultrafilter has no proper refinement), and by the ultrafilter lemma, the implication is an iff. Of course equivalently, <b>every ultranet is convergent</b> (where an ultranet is obviously a net that for all $S$ is either eventually in $S$ or eventually in $S'$).<br /><br /><div class="twn-pitfall">You might think of ultranets (nets corresponding to ultrafilters) as all sorts of stuff, like "all convergent nets are ultra", or as generalisations of monotonic sequences -- all these are wrong, and it's a useful exercise to write down counter-examples to this on the real line (hint for the monotonic sequences thing: take the union of a bunch of sets such that a tail set of your sequence is in neither this union nor its complement). 
You might also think that the "ultrafilter" formulation of the statement is related to the "halving intervals" proof of the Bolzano-Weierstrass theorem, but the filter base generated by this mechanism does not generate an ultrafilter (construct a set that isn't in the filter and its complement isn't either).</div><br />It should be clear that if we only wanted <i>sequences</i> having convergent subsequences, requiring the convergent refinement for <i>every</i> filter is only necessary if every filter has a corresponding sequence, and having a convergent refinement only suffices if every filter has a corresponding sequence -- <a href="https://thewindingnumber.blogspot.com/2019/08/topology-ii-kuratowski-closure-topology.html">recall that this occurs precisely in a <b>first-countable space</b></a>. The statement here is that <b>in and only in (inn?) a first-countable space, compactness implies sequential compactness</b>.<br /><br />OK -- we still haven't discovered what this generalised "compactness" is. 
What is the hypothesis of the generalised Bolzano-Weierstrass theorem?<br /><br />Let's consider two example spaces where the Bolzano-Weierstrass theorem doesn't apply:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-2m8xVdBtDwU/XWdaD3OuzoI/AAAAAAAAFsI/0CdUa43sB4UgMabz1O2egvYKS3HAVgpxQCLcBGAs/s1600/bw%2Bcounter%2B-%2B1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="562" data-original-width="722" height="249" src="https://1.bp.blogspot.com/-2m8xVdBtDwU/XWdaD3OuzoI/AAAAAAAAFsI/0CdUa43sB4UgMabz1O2egvYKS3HAVgpxQCLcBGAs/s320/bw%2Bcounter%2B-%2B1.png" width="320" /></a></div><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-EbWrNvAAR3A/XWdanTEToTI/AAAAAAAAFsQ/5JqzxZFkClITXhfe4JfuKLwZix_SYuVVgCLcBGAs/s1600/bw%2B-%2Bcounter%2B-%2B2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="414" data-original-width="351" height="320" src="https://1.bp.blogspot.com/-EbWrNvAAR3A/XWdanTEToTI/AAAAAAAAFsQ/5JqzxZFkClITXhfe4JfuKLwZix_SYuVVgCLcBGAs/s320/bw%2B-%2Bcounter%2B-%2B2.png" width="271" /></a></div>These two are "basically the same" kind of set -- a set with some <b>limit points removed</b>. Except in the second case, we can actually say "it's a set with some limit points removed", because these limit points exist in the space the set is in, while in the first case, the limit points have been removed from the containing space <i>itself</i>. You know, let's actually just talk in terms of the subspace topology on the set from now on to avoid having to make a distinction between the two "cases".<br /><br />Well, can we recover some notion of the limit point after it's been removed? 
A trick you may be familiar with to find limits is to take intersections -- we could just say that if this is true for all proper filters $\phi$, we have a compact set:<br /><br />$$\bigcap_{N\in\phi}N\ne\varnothing $$<br />Right?<br /><br />Not right. This intersection will actually very often be empty -- it only measures if the net has a <b>constant subnet</b> (check that this is true). What we need to do is to consider the <b>closures</b> of each set in the filter, so that the limit points, if any, are actually included -- and since the closures are members of the filter too (being supersets), we can just consider the <b>intersection of all closed members of the filter</b>.<br /><br /><div class="twn-pitfall">Note how we cannot just say something about the closure differing from the sequence -- do you see why?</div><br />So we can write e.g. <br /><br />$$\forall \phi\ne \pi(S), \bigcap_{\mathrm{closed}\ N\in\phi}N\ne\varnothing $$<br />Now while this is a perfectly good condition on its own, we see that it contains some redundant sets -- we're taking an intersection, so this kind of naturally "subsumes" the filter's defining properties of closure under finite intersections and supersets. We can just get rid of these sets and consider a bunch of sets that generate the filter -- this is called a <b>filter sub-basis</b>. 
Any set of sets generates a filter, but we specifically want to generate a proper filter -- this requires that every finite intersection of sets in the filter sub-basis is non-empty -- this is called the <b>finite intersection property</b> (FIP).<br /><br />So we have our definition of a compact set: <b>any family of closed sets satisfying the finite intersection property has a nonempty intersection</b>.<br /><br />Or we could state it in terms of the contrapositive: <b>any family of closed sets with empty intersection has a finite subfamily with empty intersection</b>.<br /><br />And then we could take the open-closed dual of the statement: <b>any open cover of $S$ has a finite subcover</b>.<br /><br />The last one is the most common formulation of compactness.<br />What the Bolzano-Weierstrass theorem really gives us is an interpretation of compact sets as "kinda" like finite sets -- not in terms of the number of points, but in terms of the amount of "space". You have only a finite amount of "space" to move around in, you can't just <b>escape to infinity</b>, where <b><i>infinity</i> is a point not in the set</b>. The Bolzano-Weierstrass theorem is really a "generalisation of the infinite pigeonhole principle" (which is precisely the statement of having "limited space"), and in fact yields the latter for the discrete topology where the only convergent sequences are the constant ones.<br /><br />So in contexts where we want to "generalise" facts about finite sets, like in Lie theory, it's natural that compact sets are of importance. 
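The cover and FIP formulations can be exercised mechanically on a toy finite space. Below is a brute-force sketch; the 3-point space and its topology are made up for illustration, and since every subfamily of a finite family is finite, compactness holds automatically here -- the point is just to see the two dual definitions checked side by side.

```python
from itertools import chain, combinations

# A toy 3-point space with a hypothetical topology (closed under
# arbitrary unions and finite intersections, as the axioms require).
X = frozenset({0, 1, 2})
opens = [frozenset(), frozenset({0}), frozenset({0, 1}), X]
closeds = [X - O for O in opens]  # complements of the open sets

def nonempty_subfamilies(family):
    family = list(family)
    return chain.from_iterable(
        combinations(family, r) for r in range(1, len(family) + 1))

def union(sets):
    return frozenset().union(*sets)

def intersection(sets):
    out = X
    for s in sets:
        out &= s
    return out

# Open-cover formulation: every open cover of X has a finite subcover.
covers = [c for c in nonempty_subfamilies(opens) if union(c) == X]
assert all(any(union(s) == X for s in nonempty_subfamilies(c)) for c in covers)

# Dual formulation: every family of closed sets with the finite
# intersection property (FIP) has nonempty total intersection.
for fam in nonempty_subfamilies(closeds):
    has_fip = all(intersection(s) for s in nonempty_subfamilies(fam))
    if has_fip:
        assert intersection(fam)  # nonempty, as compactness demands
```

Any family containing $\varnothing$ fails the FIP and is skipped; every remaining family of closed sets here intersects, exactly as the definition promises.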
To really drive the point home, consider the following reformulation of the open cover definition:<br /><br /><b>For a family of sets, a compact union of some of these sets has any property that every finite union of them does.</b><br /><b><br /></b> Or further, defining an <b>extensive property</b> as a property of open sets such that $P(S)\land P(T)\implies P(S\cup T)$, a <b>locally true</b> property as a property satisfied by some neighbourhood of every point in a space and a <b>globally true</b> property as a property satisfied by the universal set:<br /><br /><b>Every locally true extensive property is globally true iff the space is compact.</b><br /><b><br /></b> This formulation of compactness leads to several well-known results about compact sets, including: <i>A continuous real-valued function on a compact set is bounded</i>.<br /><br />More on the local/global correspondence at <a href="https://www.math.ucla.edu/~tao/preprints/compactness.pdf">Terry Tao, "Compactness and Compactification"</a>.<br /><br />So here's a list of things we've seen are equivalent to compactness:<br /><ul><li>$S$ is a compact set.</li><li>Every locally true extensive property on $S$ is globally true.</li><li>If $S$ is a union of sets, it has any property that all finite unions of them have.</li><li>All open covers of $S$ have a finite subcover.</li><li>All families of closed sets with the FIP in $S$ have a nonempty intersection.</li><li>All ultrafilters on $S$ converge.</li><li>All filters on $S$ have a convergent refinement.</li><li>All nets in $S$ have a convergent subnet.</li></ul><div><br /></div><hr /><br /><b>Compactification</b><br /><div><b><br /></b></div><div>The obvious question is how we can make a space compact by adding in some points, like we can in the "subset without its entire boundary" example (just take the closure). 
In this sense, compactification is a "generalisation" of closure, where instead of just adding points that already exist in the space, we add completely new points: points we call <b>infinity</b>. Making the correspondence precise, we are asking for an embedding of the space such that the closure of the embedded set is the "compactification" (the minimum such embedding <i>is</i> the compactification).</div><div><br /></div><div>Obviously, we have some freedom as to how to make this compactification -- as to how we can "glue the loose ends" of the space. For example, we could assign a single point $\infty$ to both sequences increasing without bound and those decreasing without bound (compactifying $\mathbb{R}$ into a circle), or we could define $+\infty$ and $-\infty$ differently, compactifying it into an interval. </div><div><br /></div><div>There are still questions on how we ensure, in general, that e.g. $(n)$ and $(n^2)$ both "converge" to the same point.</div><div><br /></div><div>Are they necessarily the same point?</div><div><br /></div><div>Suppose we add a point, call it $\infty$, to represent the limit of the sequence $(n)$. What does it mean to say that $(n)$ converges to $\infty$? It means that $(n)$ is eventually in every neighbourhood of $\infty$ -- every net corresponds to a filter, and in this case we have the neighbourhood filter of $\infty$ -- the set of all sets containing a cofinite number of positive integers. Similarly, $(n^2)$ generates the filter of all sets containing a cofinite number of squares.</div><div><br /></div><div><i>These are different points.</i> Could we <i>make</i> them the same point? For this, we'd need both $(a_n)$ and $(b_n)$ to be eventually in every neighbourhood of $\infty$ -- this corresponds to the filter of sets containing all but a finite number of points in each sequence. 
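The "eventually in" test that separates these filters can be simulated directly. Below is a bounded-horizon sketch (the `horizon` cutoff is an artifact of finite computation, not part of the definition): the set of perfect squares is a candidate neighbourhood that $(n^2)$ is eventually in but $(n)$ is not, witnessing that the two sequences generate different filters.

```python
# "Eventually in": a sequence (a_n) is eventually in a set S if some tail
# of the sequence lies entirely in S.  Sequences are given as functions of
# n, sets as membership predicates; the horizon bounds the finite check.

def eventually_in(seq, in_set, horizon=500):
    last_outside = -1
    for n in range(horizon):
        if not in_set(seq(n)):
            last_outside = n
    # a clean tail exists strictly inside the horizon
    return last_outside < horizon - 1

squares = {n * n for n in range(500)}

# (n^2) is eventually in the set of squares (trivially: it never leaves) ...
assert eventually_in(lambda n: n * n, lambda x: x in squares)
# ... but (n) keeps landing outside it, so sets like this one belong to the
# filter generated by (n^2) and not to the filter generated by (n).
assert not eventually_in(lambda n: n, lambda x: x in squares)
```

Merging the two points at infinity amounts to throwing such separating sets out of the neighbourhood filter, exactly as described above.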
So if we wanted two infinities, $+\infty$ and $-\infty$, the neighbourhoods of $+\infty$ would be the filter of sets containing all but a finite number of points of any sequence diverging to $+\infty$ -- equivalently:<br /><br />$$\begin{array}{l}N( + \infty ) = \{ S\mid \exists r,\forall x > r,x \in S\} \\N( - \infty ) = \{ S\mid \exists r,\forall x < r,x \in S\} \end{array}$$<br />But if we just wanted a single point at infinity,<br /><br />$$N(\infty)=\{S\mid\exists r,\forall |x|>r, x\in S\}$$<br />How would this <b>one-point compactification</b> work in general? We'd like to assign $\infty$ as the limit in $X\cup\{\infty\}$ of every net that does not have a limit point in $X$ (this will automatically also create a limit point at $\infty$ for nets that diverge but also have a limit point in $X$, like $n\sin(n)$) -- i.e. we want a filter that every divergent net is eventually in. Well, the point of the Bolzano-Weierstrass theorem is that a divergent net <b>eventually escapes every compact set</b>, i.e. it is eventually in every cocompact set. So in general, we have a neighbourhood <b>basis for the point at infinity</b>:<br /><br />$$B(\infty)=\{S\cup\{\infty\}\mid S\in\Phi_X\land \mathrm{compact}\ S'\}$$<br />Or in terms of open sets,<br /><br />$$\Phi_{X\cup\{\infty\}}=\Phi_X\cup\{S\cup\{\infty\}\mid\mathrm{compact}\ S'\}$$<br />This is called the <b>one-point compactification</b> or the <b>Alexandroff compactification</b> of $X$.<br /><br />But the one with two infinities was cool too. The idea was that the two infinities at the far ends of $\mathbb{R}$ are somehow "disconnected" -- this is in contrast to e.g. $\mathbb{R}^{n>1}$, where the "limits" of any divergent sequence <i>must</i> be connected by a giant loop at infinity. If we want each infinity to be a connected component of its own, what we need is a filter base of connected sets converging to a point at infinity.<br /><br />What do I mean? 
In general, all compactifications are obtained by considering some <b>partitions of co-compact sets into non-compact sets</b> -- in the case of the one-point compactification, these partitions are the trivial ones. In the case of the "plus infinity, minus infinity" compactification, the partition is the set of <b>connected components</b>.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-zcc5chPEu_0/XXPONNzxg3I/AAAAAAAAFtQ/NU71X-grBVgTWw0dRAXWQ9pGpy7sFFPfwCLcBGAs/s1600/compactification%2B-%2Bend.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="484" data-original-width="728" height="265" src="https://1.bp.blogspot.com/-zcc5chPEu_0/XXPONNzxg3I/AAAAAAAAFtQ/NU71X-grBVgTWw0dRAXWQ9pGpy7sFFPfwCLcBGAs/s400/compactification%2B-%2Bend.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Illustration of why we need a partition of the co-compact sets -- the above is <i>not</i> a partition, and there are sequences escaping every one of the four filters drawn. This is actually an illustration of why the end-compactification of the plane has only one infinity.</td></tr></tbody></table>OK, so how do we actually make this construction? The thing is that we have co-compact sets like $(-\infty,-2)\cup(-1,1)\cup(1,\infty)$ -- but $(-1, 1)$ is not really a connected component we care about. We care about the <b>connected components of a co-compact set that are a part of a chain that every co-compact set has connected components in</b>. One way to do this is as follows:<br /><ol><li>Construct an infinite nested basis $U_1\supseteq U_2\supseteq \dots$ of co-compact sets -- i.e. 
so that every co-compact set contains one of these sets.</li><li><i>Each chain</i> $N_1\supseteq N_2\supseteq\dots$ of connected components is a neighbourhood basis for a point at infinity, called an <b>end</b>.</li></ol>This is the <b>end compactification</b>.<br /><br />There are just three questions:<br /><ol><li>Is this independent of the co-compact set basis we use?</li><li>Does such a basis always exist?</li><li>Is the result always compact (this includes questions you may have about the existence of connected component chains, etc.)?</li></ol>To prove the first, it suffices to show that for any two co-compact bases $U$ and $V$, if a set contains some $N_i$, a set in some connected component chain of $U$, it contains an $M_j$, a set in some connected component chain of $V$ (this creates a natural bijection between the ends arising from $U$ and those from $V$). Because connected components are a partition, it suffices to show that for any two co-compact bases $U$ and $V$, if a set contains some $U_i$, it must contain a $V_j$ -- which is clearly true. So: <b>TRUE</b>.<br /><br />For the second -- this is equivalent to asking the dual question: is there a chain of compact sets such that every compact set is contained in a set in the chain? A simpler, equivalent way to phrase the question is: is there a chain of compact sets whose interiors cover $X$? This is called <b>exhaustion by compact sets</b>, or <b>hemicompactness</b> -- and as you may have guessed, <i>not</i> all spaces have it. So: <b>FALSE</b>.<br /><br />TBC: why 3 is false, general version, Gromov boundary, Stone-Cech, why Hausdorff compactifications are preferred, add labels<br /><br />A stupid joke -- the seven Cs: closed, compact, connected, convergent, continuous, covers, co-. 
Covers and closed are actually repeated though aren't they?</div>bolzano-weierstrasscompactnessfiltersfinite intersection propertylimit pointsnetstopologyThu, 29 Aug 2019 09:36:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1605067931625555034Abhimanyu Pallavi Sudhir2019-08-29T09:36:00ZComment by Abhimanyu Pallavi Sudhir on Motivation for the Definition of Compact Space
https://math.stackexchange.com/questions/1409837/motivation-for-the-definition-of-compact-space/1410256#1410256
@Massimo wasn't asking you to replace "is" with "should be" -- the point is that compactness says that every collection of closed sets <i>that has</i> the finite intersection property has nonempty intersection. But even otherwise, I don't see what you're trying to say with the answer -- what does "thinking of compact sets as closed" have to do with using a definition that involves the phrase "closed sets"?Thu, 29 Aug 2019 05:55:46 GMThttps://math.stackexchange.com/questions/1409837/motivation-for-the-definition-of-compact-space/1410256?cid=6869282#1410256Abhimanyu Pallavi Sudhir2019-08-29T05:55:46ZSupplementary definitions: connectedness, clopen sets
https://thewindingnumber.blogspot.com/2019/08/supplementary-definitions-connectedness.html
What does it mean for a space to have $n$ "separate" parts?<br /><br />Well, let's start from the basics -- how do we count the $n$ parts? We label them $1,\dots,n$, i.e. we have a surjection $\ell:X\to [n]$. Now, for any point in $X$ mapping to some $k\le n$, we should be able to jiggle it around a bit so that it still maps to $k$, otherwise the other "part" mapping to something else will be connected to this one. In other words, $\forall k, \ell^{-1}(\{k\})\in\Phi_X$.<br /><br />Thus, we say that $X$ has $n$ <b>separate components</b> if there exists a <b>continuous surjective labeling function</b> $\ell:X\to [n]$, where $[n]$ is equipped with the discrete topology -- equivalently, we say that $X$ is the <b>union of $n$ nonempty disjoint open sets</b> (the "nonempty" part comes from the "surjective" part) (we'll call this an <i>open partition</i>). We can then define a <b>connected space</b> as one that cannot be written as an open partition with $n > 1$.<br /><br />What about a set being connected? If we similarly define maps from the set to a labeling topology, we can see that what we're looking for here is precisely the notion of a subspace topology -- a set is a <b>connected set</b> if taken as a subspace, it is a connected space. This is clearly equivalent to saying that the set cannot be <i>covered</i> by two non-empty disjoint open sets, i.e. does not have an <i>open disjoint cover</i>.<br /><br />Let's define three parallel concepts to define the notion of a <b>connected component</b>:<br /><ul><li>A <i>maximal component partition (i)</i> of a space $X$ is an open partition of $X$ with the largest possible value of $n$ -- a <i>maximal component (i)</i> is a set in this partition. 
</li><li>An <i>open connected partition (ii)</i> of a space $X$ is an open partition of $X$ such that each component is connected -- an <i>open connected component (ii)</i> is a set in the partition.</li><li>A <i>connected component (iii)</i> of a space $X$ is a connected subset $S$ all of whose proper supersets are disconnected. A <i>connected component list (iii)</i> is the list of all such distinct subsets.</li></ul>(You can tell (i) is going to be ugly -- maximums and all don't play well with infinite cardinalities)<br /><br />So many questions.<br /><ol><li><b>Does a maximal component partition (i) always exist?</b> If $n$ must be an integer, then FALSE (e.g. $\mathbb{Q}$); If $n$ is any cardinal number, then TRUE (as the cardinality of the space is an upper bound). Note that with cardinal numbers, just because there is a maximum $n$ doesn't mean there is a maximal such partition, i.e. one that cannot be partitioned any further. That would be an open connected partition (ii).</li><li><b>Does an open connected partition (ii) always exist?</b> FALSE. E.g. $\mathbb{Q}$ with the standard topology -- any open set contains an infinite number of rational numbers, which is clearly disconnected.</li><li><b>Is a connected component list (iii) a partition of $X$?</b> TRUE.</li><ol type="a"><li><b>(Disjoint)</b> The union of two distinct connected components is disconnected by assumption, so can be written as an open partition, which creates an open disjoint cover of each connected component -- this can only be true of an open set if the sets in the cover are precisely the connected components -- since they are disjoint, the conclusion follows. </li><li><b>(Union)</b> There is a unique connected component around each point, and it is the union of all connected sets containing the point (Proof: suffices to show this union is connected. 
An open disjoint cover of the union would be an open disjoint cover of each of these sets, and since none of these sets are disjoint, this would imply some of them are disconnected, contradiction.), and is clearly nonempty. The union of the connected components around each point contains each point.</li></ol><li><b>Sameness between partitions (notation: <i>is (i) implies is (ii)</i>)</b></li><ol type="a"><li><b>(i) imp (ii)?</b> FALSE in general, e.g. $\mathbb{Q}$.</li><li><b>(i) imp (iii)?</b> FALSE in general, e.g. $\mathbb{Q}$. </li><li><b>(ii) imp (iii)? </b>TRUE.</li><li><b>(ii) imp (i)? </b>TRUE. Suffices to show every open partition has at most as many sets as an open connected partition (ii). From 9, there exists a (ii) with at least as many sets as the given partition. From 6, this (ii) is unique.</li><li><b>(iii) imp (i)?</b> FALSE in general, e.g. $\mathbb{Q}$.</li><li><b>(iii) imp (ii)?</b> FALSE in general, as above, e.g. $\mathbb{Q}$. But if (ii) exists, then TRUE, by 4c and 7.</li></ol><li><b>Is a maximal component partition (i) unique up to permutation?</b> FALSE, as we saw with $\mathbb{Q}$.</li><li><b>Is an open connected partition (ii) unique up to permutation?</b> TRUE, by 4c and 7.</li><li><b>Is a connected component list (iii) unique up to permutation?</b> TRUE, from the "uniqueness" lemma in 3b.</li><li><b>Is every set $S$ in an open partition a union of maximal components (i)?</b> Ambiguous since the maximal components are not unique. If we want this to be true for any maximal component partition, this is clearly FALSE, e.g. $\mathbb{Q}$. 
If we're saying there exists a maximal component partition such that this is true, then TRUE -- just consider the maximal component partitions of each such set and note that together they form a maximal component partition of the space.</li><li><b>Is every set $S$ in an open partition a union of open connected components (ii)?</b> TRUE by 4c and 10.</li><li><b>Is every set $S$ in an open partition a union of connected components (iii)?</b> By 3, it suffices to show that every connected component is either disjoint from $S$ or contained in it. This is true as otherwise their union would be a connected superset of the connected component.</li></ol>So here's our conclusion from all this: (iii) should be our definition of a <b>connected component</b> -- a connected subset whose proper supersets are all disconnected, or by construction, the union of all connected sets containing a given point. When these components are open, they are the same as in (ii) (which is the only time (ii) exists) -- they are open sets that partition your space. And when the number of connected components is finite, they are the same as (i).<br /><br />The reason we initially thought the definitions would all be equivalent is that we were only visualising cases with a <b>finite</b> number of connected components. In this case, it's easy to see that the connected components are open -- just keep partitioning with open sets until you have as many as the number of connected components. If you can partition any further, you have an open disjoint cover of a connected component, which is a contradiction. This construction cannot be reproduced with $\mathbb{Q}$.<br /><br />By the way, we've used (and proven) the following obvious fact throughout the above series of exercises: <b>the union of two non-disjoint connected sets is connected</b>.<br /><br />Here are some more facts:<br /><ol><li><b>Closure of a connected set is connected.</b> (i.e. 
something touching a connected set is "connected to it") Suppose not. Then the disjoint open cover of $\mathrm{cl}(S)$ is a disjoint open cover of $S$ unless $\mathrm{cl}(S)-S$ is open, which it isn't (as every open set containing a member of $\mathrm{cl}(S)$ intersects $S$).</li><ol type="a"><li>(Corollary) <b>A connected component is a closed set.</b></li><li>(Corollary) <b>The union of two touching connected sets is connected.</b></li></ol><li>The topology on $X$ is the disjoint union topology of those of the disjoint open sets partitioning it. (exercise left to the reader)</li><ol type="a"><li>(Corollary) If a space $X$ can be partitioned into $n$ disjoint open sets each homeomorphic to $A$, its topology is the product topology $A\times [n]$ where $[n]$ has the discrete topology.</li></ol></ol>We also have the following interpretation of connected components: they are the <i>equivalence classes</i> of the relation "there exists a connected set containing both points".<br /><br />Make sure you understand why the following are true, and why the "finite" qualifiers are needed where we've used it.<br /><ul><li>A connected component, and thus a finite union of connected components, is always closed.</li><li>If there are only a finite number of connected components, a connected component, and thus a (finite) union of connected components is always open.</li></ul><div>Also: <b>continuous functions preserve connectedness</b> (consider the preimage of a connected component). This is a generalisation of the <b>intermediate value theorem</b> (do you see why?).</div><hr /><b><br /></b> <b>Clopen sets</b><br /><br />You may have noticed that when we have a finite number of connected components, each component is both closed and open. Closed as always, because nothing touches a connected component. Open, because the complement is a finite union of closed sets and thus closed. When we have infinite components, sometimes some infinite unions act this way. 
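These constructions are small enough to brute-force. The sketch below uses a made-up 4-point space (`opens` is a hypothetical toy topology with two obvious "pieces"): it computes each connected component as the union of all connected sets containing a point, then confirms that with finitely many components, each one is both open and closed, as discussed above.

```python
from itertools import chain, combinations

# Toy 4-point space whose open sets split it into two "pieces".
X = frozenset(range(4))
opens = [frozenset(), frozenset({0, 1}), frozenset({2, 3}), X]

def subsets(s):
    s = list(s)
    return map(frozenset,
               chain.from_iterable(combinations(s, r) for r in range(len(s) + 1)))

def is_connected(S):
    # S is disconnected iff two open sets cover it with disjoint,
    # nonempty traces on S (the subspace-topology criterion).
    for U in opens:
        for V in opens:
            if not (U & V & S) and S <= U | V and (U & S) and (V & S):
                return False
    return True

def component(x):
    # the connected component of x: union of all connected sets containing x
    return frozenset().union(
        *(S for S in subsets(X) if x in S and is_connected(S)))

components = {component(x) for x in X}  # the two pieces: {0,1} and {2,3}

# With finitely many components, each one is clopen:
assert all(C in opens and (X - C) in opens for C in components)
```

The same brute-force check shows $X$ itself is disconnected here: the opens $\{0,1\}$ and $\{2,3\}$ form exactly the kind of open disjoint cover the definitions describe.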
Sets that are both open and closed are called <i>clopen</i>.<br /><br />In fact, every <b>clopen set $S$ is necessarily a union of (possibly infinite) connected components</b>. This is an instructive proof to work through -- the proof you'd see is: suppose for contradiction that $\exists x\in X$ touching both $S$ and $S'$. Then $x\in S$ implies $S'$ not closed and $x\in S'$ implies $S$ not closed, contradiction.<br /><br />The intuition behind the proof is this: we can understand a closed set as a set where all convergent nets lying inside converge inside, and an open set as a set where all nets converging inside lie inside. So if we stuck a set "touching" $S$ (i.e. with some point touching them both), then for any point on their border, we could construct sequences on both sides converging to it. Then both $x\in S$ and $x\in S'$ lead to contradictions.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-cws3z90Q3lg/XXJ53dtCDmI/AAAAAAAAFtE/7ksvQTi_CBorGu6PnlPgDScr8OztXU9WQCLcBGAs/s1600/clopen.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="454" data-original-width="489" height="297" src="https://1.bp.blogspot.com/-cws3z90Q3lg/XXJ53dtCDmI/AAAAAAAAFtE/7ksvQTi_CBorGu6PnlPgDScr8OztXU9WQCLcBGAs/s320/clopen.png" width="320" /></a></div>The open set/closed set formalism is a way to simplify this whole reasoning without having to worry about nets and their limits.connected componentconnectednessmathematicstopologyWed, 28 Aug 2019 04:35:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7540523747956706630Abhimanyu Pallavi Sudhir2019-08-28T04:35:00ZTopology III: example topologies, inherited topologies, distinguishability
https://thewindingnumber.blogspot.com/2019/08/topology-iii-example-topologies.html
Let's think of some example topologies one may introduce on a set -- this is easiest in the open set formalism. Prove the statements we make.<br /><ul><li>The <b>discrete topology</b>, where $\Phi=\wp(X)$, which are also the closed sets. The neighbourhoods are simply the principal filters, i.e. $N(x)=\{S\subseteq X\mid x\in S\}$. The closure operator is just the identity and $\sim$ is just $\in$. The limit of a function is simply its value at the point. All functions <i>from</i> a discrete space are continuous everywhere. This basically adds no structure at all to a set, as each point is essentially "separated" from each other point. Clearly, any bijection between sets acts as a homeomorphism between the corresponding discrete spaces.</li><li>The <b>indiscrete topology</b>, where $\Phi=\{\varnothing, X\}$, which are also the closed sets. The neighbourhoods are $N(a)=\{X\}$. Every point touches every non-empty set, and the closure of a non-empty set is just $X$. Every point is the limit of every function. All functions <i>to</i> an indiscrete space are continuous. In other words, every point is "basically the same" -- the points cannot be distinguished.</li><li>The <b>kinda-sorta topology</b>, where $\Phi=\wp(S)\cup \{X\}$ for some $S\subset X$. You should be able to see by now that this topology has a bunch of discrete points and a bunch of indistinguishable points. A neighbourhood of a point in $S$ touches only the sets containing it, but a neighbourhood of a point outside $S$ touches every non-empty set -- so the closure of a set is just its union with $S'$. Not very interesting, so I'll stop.</li><li>The <b>cofinite topology</b> on an infinite set, where $\Phi = \{S\mid \mathrm{finite}\ S'\}\cup\{\varnothing\}$ -- the closed sets are the finite sets and $X$ -- a neighbourhood of a point is a cofinite set containing that point. 
A point touches a set if that set is infinite or contains that point -- so a finite set is its own closure and the closure of an infinite set is $X$. For a function $f:X\to Y$ where $X$ is cofinite, its limit at $a$ is a point such that for each of its neighbourhoods, almost all values of $x$, including $a$, map into it. The function is continuous at $a$ if for each neighbourhood of $f(a)$, almost all $f(x)$ are in it -- it is continuous everywhere if every open set in $Y$ contains almost all $f(x)$. In particular, if $Y$ is also cofinite, the limit can only be equal to $f(a)$, which it is iff the function is finite-to-one at every point besides $a$ -- and a function that is continuous everywhere is a finite-to-one function.</li><li>The <b>cocountable topology</b> on the real numbers is basically the same as above except with "countable" replacing "finite" everywhere.</li><li>The <b>cobounded topology</b> on a metric space -- again kinda the same idea.</li><li>The <b>co(finite volume) topology</b> on a measurable space -- same idea, I guess? Check and find out.</li><li>The <b>first-n topology</b> on the naturals where a set is open if it is of the form $\{x< n\}$ for some (natural or infinite) $n$. The neighbourhoods of a point $x$ are all the sets containing $\{1,2,\dots,x\}$, a point touches a set if the set contains a point less than or equal to it, the closure is $\mathrm{cl}(S)=\{x\ge\min S\}$. The limit of a function $f:X\to Y$ ($X$ equipped with the first-n topology) at a point $a$ is $L$ if $f(x)$ is topologically indistinguishable from $L$ for all $x\le a$. 
If $Y$ is also first-n, then the function's limit is any upper bound on its value for $x\le a$ -- the function is continuous at $a$ if its value at $a$ is higher than that at any lower value, and is continuous everywhere if it is (not strictly) increasing.</li></ul><div>You should get some kind of feel for open sets now -- if you can find an open set containing one point but not another, they can be distinguished in some sense. This is the definition of <b>topological distinguishability</b>.<br /><br />One may then define a partial order on the topologies on a set based on <b>fineness</b> and <b>coarseness</b> (this is called the <b>comparison of topologies</b>), where $\Phi_1\le\Phi_2\iff \Phi_1\subseteq\Phi_2$. The discrete topology is the finest topology and the indiscrete/trivial topology is the coarsest topology. A continuous function $f:X\to Y$ remains continuous if $X$ becomes finer or $Y$ becomes coarser, but a homeomorphism does not remain a homeomorphism in either situation (though it can if they both change in a specific way).</div><div><br /></div><div>For fun, let's try to visualise the cofinite topology (keep the set $\mathbb{N}$ in mind) -- the only sets with any sort of interiors are the cofinite sets. A <i>finite</i> set is closed, has no interior (it's its own boundary, i.e. it's a bunch of discrete points) and doesn't touch anything outside (because it's closed). An <i>infinite </i>set touches every single point, making them all part of its closure -- it is closed at the points it contains and open at the points it doesn't contain. This explains the nature of the open and closed sets to us: if a point is contained in a finite set, it is located on its boundary. 
No matter how large you make the set, as long as it is not cofinite, it cannot contain the point in its interior, as its complement is infinite and thus touches the point.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/--WNBTBBh0c0/XWD4mlvGaOI/AAAAAAAAFrM/89M7Pfd3Tzc2PtWNAc475M0X16Kij857gCLcBGAs/s1600/cofinite.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="760" data-original-width="764" height="397" src="https://1.bp.blogspot.com/--WNBTBBh0c0/XWD4mlvGaOI/AAAAAAAAFrM/89M7Pfd3Tzc2PtWNAc475M0X16Kij857gCLcBGAs/s400/cofinite.png" width="400" /></a></div><div>You can see that the description above -- while a very intuitive explanation of how the topological features of the space interact -- is essentially equivalent to rigorous topological arguments. So our axioms of topology really do describe spaces with some very similar properties to ones we're used to dealing with.</div><div><br /><b>More examples: inherited topologies</b></div><div><br /></div><div>We've already seen one kind of inherited topology: that inherited from a metric space -- where the open sets are just sets with some "wiggle-room" for each element.<br /><br />Here are some obvious "inherited topologies" -- topologies determined by some other structure:</div><div><ul><li><b>Subspace topology -- </b>The open sets define a generalised "wiggle room" in a set $X$ -- for a function to be continuous, it must be similar to its value for some "wiggle room" in its domain. In a subspace (subset) $S$ of that space, we don't really know or care how a function on $S$ behaves outside $S$ -- for it to be continuous <i>in</i> $S$ at a point, the only wiggle room we care about is the wiggle room inside $S$ (e.g. 
the Heaviside step function is continuous on the nonnegative reals) -- and since the open sets on $X$ define what wiggle room is in this topological space, $\Phi_{S}=\{O\cap S: O\in\Phi_X\}$. Check that the axioms apply.</li><li><b>Disjoint union topology -- </b>You can't really invert the subspace topologies on a partition to necessarily recover the original topology -- you necessarily lose information about how the subspaces stick to each other when you cut it up. But you can still just "stitch together" some topological spaces, i.e. take a disjoint union of some sets -- it's important that you don't cut anything while doing so, so the "stitching" (i.e. the canonical injections) need to be continuous. There are multiple ways to define such a stitching, but the canonical disjoint union topology is the finest among them -- we let <i>all</i> sets $U$ whose pre-images $p_i^{-1}(U)$ in each $X_i$ are open be open. I.e. $\Phi_{\sum X_i}=\{O\mid \forall i, O\cap X_i\in\Phi_{X_i}\}$.</li><li><b>Product topology -- </b>Continuity on $X\times Y$ means being able to wiggle a little bit in any direction on $X\times Y$ and keep the function's value similar. The only way to generally define this notion of a general wiggle space is as an arbitrary union of little "squares" (called <b>open cylinders</b>), like in calculus, i.e. with some abuse of notation: $\Phi_{\prod X_i}=\{\bigcup \prod O_i : O_i \in \Phi_{X_i}\}$.</li><li><b>Quotient topology -- </b>Quite often, we want to "collapse" a topology by tying together some nearby or otherwise related points. Such a relation is an equivalence relation $\sim$, and what we need is a topology on the equivalence classes. The idea behind the topology is that the equivalence classes of nearby points are still nearby, i.e. so that the function $q(x)=[x]$ is continuous while not becoming too indiscrete: i.e. 
$\Phi_{X/\sim}=\{S\subseteq X/\sim\mid q^{-1}(S)\in\Phi_X\}$.</li><li><b>Initial topology -- </b>The subspace topology can be understood as asking that the embedding of the subspace into the parent space is continuous, while not adding too many open sets. This notion can be generalised -- we can ask for the coarsest topology on a set $X$ that makes a family of functions from $X$ continuous. This is a generalisation of the subspace and product topologies.</li><li><b>Final topology -- </b>Dually, we may ask for the finest topology on a set $X$ that makes a family of functions to $X$ continuous. This is a generalisation of the disjoint union and quotient topologies.</li></ul></div><div><b>Elementary theorems (exercises)</b></div><div><ol><li>Figure out if these statements are True/False:</li><ol type="a"><li>A set is closed iff it is closed under limits of nets.</li><li>The closure of a set equals the intersection of all closed sets containing the set.</li><li>The interior of a set equals the union of all open sets contained in the set.</li><li>A set equals the union of all closed sets contained in the set.</li><li>A set equals the intersection of all open sets containing the set.</li><li>Having an empty interior implies being closed.</li><li>Having an empty interior is equivalent to being a subset of its boundary.</li><li>$(\forall N\in N(x),\ y\in N)\implies x=y$. </li><li>Limits are unique.</li><li>If $S$ and $T$ are disjoint from each other's closures, their closures are disjoint.</li><li>The product topology is the initial topology with respect to the projection maps.</li><li>$\sum_I X$ with the disjoint union topology is homeomorphic to $X\times I$ (with $I$ endowed with a discrete topology) with the product topology.</li></ol><li>Explain (not prove!) why the interval $[0,1)$ is not homeomorphic to $S^1$.</li></ol><b>Answers</b><br /><ol><li>True/False:</li><ol type="a"><li><b>TRUE.</b> (forward) Suppose a net in $S$ converges to a point $a$ in $S'$. 
Since $S'$ is open, there is an open set around $a$ contained in $S'$, and this open set must contain points of the net, which lie in $S$ -- contradiction. (backward) Suppose $S$ is not closed, so there is a point in $S'$ whose neighbourhoods all intersect $S$. We can use these neighbourhoods to construct a net converging to it.</li><li><b>TRUE.</b> Suffices to show that the closure of $S$ is closed, and is contained in every closed set containing $S$. (1) The set of all points around which there exists an open set not intersecting $S$ is an open set, as it is the union of all these "existing" open sets. (2) Take a point $p$ outside a closed set $C\supseteq S$. Then $C'$ is an open set containing $p$ that does not intersect $S$, thus $p$ does not touch $S$. So every point that does touch $S$ is in $C$.</li><li><b>TRUE.</b> Dual to the above.</li><li><b>FALSE.</b> See e.g. the trivial topology. True for a metric space, though.</li><li><b>FALSE.</b> Dual to the above.</li><li><b>FALSE.</b> Remember the infinite non-cofinite sets in the cofinite topology? This isn't even true on a metric space (poke a hole in a curve).</li><li><b>TRUE.</b> Obviously.</li><li><b>FALSE.</b> If you look through the proof for metric spaces, you'll see that it relies on the ability to create an open set that separates the two non-equal points. This is topological distinguishability -- the kind of space for which this statement is true is a <i>T1 space</i>; if we had the symmetric condition, it would be true in a <i>T0 space</i>.</li><li><b>FALSE. </b>Remember the first-n topology? The proof requires having disjoint neighbourhoods of distinct points -- this requires being a <i>T2 space</i>, also known as a "Hausdorff space".</li><li><b>FALSE.</b> Consider $\mathbb{R}^-$ and $\mathbb{R}^+$.</li><li><b>TRUE.</b> Requiring the projection maps to be continuous just means that each $p_i^{-1}(U)$ is open. The topology generated by these is the topology described -- so what is it?
Intersecting finitely many of these sets as $\bigcap_{i\in F} p_i^{-1}(U_i)$ (with $F$ finite) leads to the open cylinders, which generate the product topology.</li><li><b>TRUE.</b> $p_i^{-1}(U)\cap X_i = U$.</li></ol><li>Because the preimage of the open set $[0,1/2)$ is not open in the circle. This goes back to the fact that because we're cutting the circle, the neighbourhoods of 0 become "easier". Again, this is <i>not</i> a proof, just a proof that cutting doesn't work.</li></ol></div>cofinite topology, mathematics, subspace topology, topological indistinguishability, topologySun, 25 Aug 2019 20:23:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-548142772842721012Abhimanyu Pallavi Sudhir2019-08-25T20:23:00ZTopology II: Kuratowski closure topology, nets, neighbourhood basis
https://thewindingnumber.blogspot.com/2019/08/topology-ii-kuratowski-closure-topology.html
0<a href="https://thewindingnumber.blogspot.com/2019/08/topology-limits-continuity-and.html">So far</a>, we've described two axiomatisations of topology: in terms of <b>neighbourhoods</b> and in terms of <b>open sets</b>. While the neighbourhoods definition was a natural extension of our understanding of topological structure being in terms of limits, the open sets definition is kind of hard to wrap your head around, as far as I can see. A continuous function doesn't even preserve open sets (the definition of a continuous function isn't "preserves neighbourhoods" in the neighbourhood formulation either, but at least we know where it comes from). It's not openness in particular that's important -- we could formalise topology in terms of their complements, the closed sets, too.<br /><br />In the following set of exercises, we will build an alternative axiomatisation of topology based on the notion of <b>touching</b>, which will perhaps give us some explanation of why the notions of openness and closedness are important to topology.<br /><ol><li>Consider the relation of "touching" between a point and a set (it can't be between a point and a point, but it can be with a set -- kind of for the same reason that a sequence tends to a point but there is no point in the sequence that equals that point). How would you write this in terms of the open set topology? (<b>Ans:</b> every open set containing $x$ intersects $S$)</li><li>Now formulate an open set in terms of touching. (<b>Hint:</b> formulate a closed set first, the open set is its complement) (<b>Ans:</b> no point in $S$ touches $S'$)</li><li>Great.
Find out what axioms we need on the touching operation $\sim$ to prove the three axioms of open sets.</li><ol type="a"><li>$X$ is open requires -- (<b>Ans:</b> $\not\exists x \sim \varnothing$)</li><li>$O_1\cap O_2\in\Phi$ requires -- (<b>Ans:</b> $x\sim S\cup T\Rightarrow x\sim S \lor x\sim T$)</li><li>$\bigcup_\lambda O_\lambda \in\Phi$ requires -- (<b>Ans:</b> $x\sim S\subseteq T \Rightarrow x\sim T$)</li></ol><li>Given a touching relation $\sim$, we can produce the set of open sets $\{S\mid \forall x\in S, x\not\sim S'\}$. From this, we can produce the relation $\bar\sim$, given by: $\forall T\, \mathrm{st.} (x\in T \land \forall y\in T, y\not\sim T'), S\cap T\ne\varnothing$. Try proving this is equivalent to $x\sim S$, and see what axioms you need.</li><ol type="a"><li>$\Leftarrow$ requires -- (<b>Ans:</b> None. Suppose $S\cap T$ were empty. Then $S\subseteq T'$, so by 3c-ans, $x\sim T'$, contradiction.)</li><li>$\Rightarrow$ requires -- (<b>Hint:</b> given such a $T$, construct a smaller $T$ whose intersection with $S$ is empty if $x\not\sim S$) (<b>Ans:</b> $S\subseteq\mathrm{cl}(S)$ and $\mathrm{cl}(\mathrm{cl}(S))\subseteq\mathrm{cl}(S)$. Suppose $x\not\sim S$. Then for any $T$ satisfying the LHS (e.g. the universe by 3a-ans), consider $T\cap\mathrm{cl}(S)'$ where $\mathrm{cl}(S)$ is the set of all points touching $S$ -- $x\in T\cap \mathrm{cl}(S)'$ by assumption; now given $y\in T\cap \mathrm{cl}(S)'$ we want to show $y\not\sim T'\cup \mathrm{cl}(S)$ (for this to contradict the claim that $S\cap(T\cap\mathrm{cl}(S)')\ne\varnothing$, we need that $S\subseteq\mathrm{cl}(S)$ -- <i>this is new!</i>).
Suppose that $y\sim T'\cup\mathrm{cl}(S)$ and apply 3b-ans: $y\sim T'$ contradicts $y\in T$ as $T$ is open; now we just want $y\sim\mathrm{cl}(S)$ to imply $y\in\mathrm{cl}(S)$ -- <i>this is new!</i>)</li></ol><li>So rewrite our axioms in terms of the <b>closure operator</b> $\mathrm{cl}$ as follows -- and this is completely equivalent to our earlier "touching" description of the closure operator as $x\sim S\iff x\in\mathrm{cl}(S)$:</li><ol type="a"><li>$\mathrm{cl}(\varnothing)=\varnothing$</li><li>$\mathrm{cl}(S\cup T)\subseteq\mathrm{cl}(S)\cup\mathrm{cl}(T)$</li><li>$S\subseteq T\Rightarrow \mathrm{cl}(S)\subseteq\mathrm{cl}(T)$</li><li>$S\subseteq\mathrm{cl}(S)$</li><li>$\mathrm{cl}(\mathrm{cl}(S))\subseteq\mathrm{cl}(S)$</li></ol><li>To get yourself comfortable with this closure operator, check if the following statements are true or false:</li><ol type="a"><li>$\mathrm{cl}(S)\cap\mathrm{cl}(T)\subseteq\mathrm{cl}(S\cap T)$ (<b>Ans: </b>False -- consider two disjoint sets sharing a boundary)</li><li>$\mathrm{cl}(S\cup T)=\mathrm{cl}(S)\cup\mathrm{cl}(T)$ (<b>Ans:</b> True -- in fact can replace 5b and 5c)</li><li>$\mathrm{cl}(\mathrm{cl}(S))=\mathrm{cl}(S)$ (<b>Ans:</b> True -- it's also equivalent to the statement that $\mathrm{cl}(S)$ is closed, i.e. that no point in $\mathrm{cl}(S)'$ touches $\mathrm{cl}(S)$ -- can you see why?)</li><li>$S\subseteq\mathrm{cl}(T)\Rightarrow \mathrm{cl}(S)\subseteq\mathrm{cl}(T)$ (<b>Ans:</b> True -- in fact can replace 5c and 5e)</li><li>$\mathrm{cl}(S\cap T)\subseteq\mathrm{cl}(S)\cap\mathrm{cl}(T)$ (<b>Ans:</b> True -- in fact the arbitrary-intersection version of this is the reason we have the equivalent 5c, from 3c-ans, and can replace it)</li></ol><li>From 5c and the answer to 6a, you might see an analogy with <b>limits of sequences</b> -- indeed, a "set" is kind of a sequence, and its closure is to add the <b>limit points</b> of the sequence to the set. 
Indeed, the definition of a <b>continuous function</b>, as we will see, is $f(\mathrm{cl}(S))\subseteq\mathrm{cl}(f(S))$, i.e. all <b>limit points remain limit points</b> (and for a homeomorphism, the reverse inclusion is also true). This definition should be fairly obvious as it just says "if $x$ touches $S$, $f(x)$ touches $f(S)$", i.e. nothing gets ripped apart in the process. So $\mathrm{cl}$ is <b>a generalisation of a limit</b> to any subset of $X$.</li><li>Well, actually, this is not clearly a stronger condition than "convergent sequence limits are preserved" -- that actually requires that convergent sequences remain convergent, that you don't just add a new limit point like you can for non-convergent sequences.</li></ol><div>But this does get us thinking -- we saw above that "preserves the limit points of every set" fully describes a continuous function. But does "preserves the limits of convergent sequences" -- which we used to motivate the idea of topology in the first place -- still characterise a continuous function? Is $x_n\to a\Rightarrow f(x_n)\to f(a)$ (for all sequences $(x_n)$) equivalent to $x\to a \Rightarrow f(x)\to f(a)$ like it does for standard metric spaces? Certainly the backward implication is correct.</div><div><br /></div><div>Let's go through the proof of the forward implication on metric spaces.</div><div><br /></div><div>Suppose $f$ is not continuous at $a$. So $\exists\varepsilon>0$ such that $\forall\delta>0,\exists x\,\mathrm{st.} |x-a|<\delta, |f(x)-f(a)|\ge\varepsilon$. We want to construct a sequence $(x_n)$ converging to $a$ so that $\forall N, \exists k \ge N, |f(x_k)-f(a)|\ge\varepsilon$. We construct the $n$th element of the sequence from the choice function for $\delta = 1/n$, which is an $x$ within $1/n$ of $a$ such that $|f(x)-f(a)|\ge\varepsilon$. </div><div><br /></div><div>OK -- how can we generalise this to an arbitrary topological space? </div><div><br /></div><div>Suppose $f$ is not continuous at $a$. 
So $\exists N\in N(f(a))$ such that $f^{-1}(N)\notin N(a)$, i.e. $\forall M\in N(a), \exists x\in M, f(x)\notin N$ (make sure you can tell that these are indeed equivalent). We want to construct a sequence $(x_n)$ converging to $a$ so that $\forall K, \exists k\ge K, f(x_k)\notin N$. Now here's the deal: if we can construct a sequence of $M$s in $N(a)$ that ultimately converge to the point $a$, like we could in the metric case, we are done.</div><div><br /></div><div>What does it mean to "ultimately converge to the point $a$"? We need that the choices are eventually contained in every neighbourhood of $a$ -- or to write it down concisely: we want a sequence of sets $M_i\in N(a)$ such that $\forall N\in N(a), \exists i, M_i\subseteq N$. This is called a <b>neighbourhood basis</b> -- and particularly since the domain of $i$ is the natural numbers, a <b>countable neighbourhood basis</b>.</div><div><br /></div><div>Well, so does every topology admit a countable neighbourhood basis to every point? As it turns out, there are counter-examples.</div><div><br /></div><div>So we've generalised a bit beyond the notion of preserving limits of sequences, and this is OK -- the natural numbers aren't <i>that</i> fundamental, are they? I would say that preserving the limit points of sets as in pt. 7 is a more fundamental notion than preserving the limits of convergent sequences. 
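For the record, here is one standard counterexample to first-countability (a sketch of my own, not spelled out in the post):

```latex
Let $X$ be an uncountable set with the \textbf{cocountable topology}
$\Phi=\{\varnothing\}\cup\{O\subseteq X: X\setminus O\text{ countable}\}$.
Then no point $x$ admits a countable neighbourhood basis $M_1,M_2,\ldots$:
every neighbourhood is uncountable, so we may pick $x_i\in M_i\setminus\{x\}$,
and $U=X\setminus\{x_1,x_2,\ldots\}$ is then an open neighbourhood of $x$
(its complement is countable) containing no $M_i$. Relatedly, a sequence
converges in this space iff it is eventually constant.
```

This also yields a sequentially continuous but discontinuous map: the identity from $(X,\Phi)$ to $X$ with the discrete topology preserves all (eventually constant) sequence limits, yet singletons are not open in $\Phi$.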
In any case, different terms exist for different levels of specialisation, such as:</div><div><ul><li>Pre-topological space (where you can have "multiple layers of boundaries" -- this is what happens if you leave out idempotence of closure, the arbitrary union axiom of open sets, or the wiggle-room axiom of neighbourhoods)</li><li>Topological space</li><li>T0-space (distinguishability)</li><li>T1-space</li><li>T2-space or Hausdorff space (uniqueness of limits of nets)</li><li>Alexandrov topology (with arbitrary intersection <i>and</i> union)</li><li>First-countable space (for which the "limits of sequences" thing suffices)</li><li>Second-countable space</li><li>Separable space</li><li>Uniform space</li><li>Metrisable space</li></ul><div>In any case -- although we do not always have a countable neighbourhood basis, we <i>do</i> always have a neighbourhood basis for a point -- the entire neighbourhood filter itself. And while the neighbourhood filter isn't a countable set, it is a <b>directed set</b> (a poset where every two elements have a common upper bound -- here, order the filter by reverse inclusion, so that $N_1\cap N_2$ sits above both $N_1$ and $N_2$). The generalisation of a sequence to a directed domain is called a <b>net</b>. </div><div><br /></div><div>We can define the limit of a net from a directed set $D$ in the same way as usual with filters: $\forall N\in N(a), f^{-1}(N)\in N(+\infty)$, or $\forall N\in N(a), \exists K\in D, \forall k \ge K, x_k\in N$. Then our generalisation of the "limits of a sequence" motivation for topology is:</div></div><div><br /></div><div><b>A map is continuous if it preserves the limits of all convergent nets.</b></div><br />Of course, the convergence of nets is equivalent to the convergence of filters.<br /><br /><hr /><br /><b>Dense sets</b><br /><b><br /></b>Let's discuss a quick application of closure -- recall how $\mathbb{Q}$ is <b>dense</b> in $\mathbb{R}$. This can be formulated in numerous ways, but the simplest is probably "every nonempty open set in $\mathbb{R}$ intersects $\mathbb{Q}$".
Does this remind us of something? Yes, of course -- it's the definition of closure in terms of open sets, i.e. $\mathrm{cl}(\mathbb{Q})=\mathbb{R}$.<br /><br />And this is obviously true -- it's the definition of $\mathbb{R}$, isn't it? So there's a very natural explanation for the rationals being dense in the reals.directed sets, filters, first-countable spaces, Kuratowski closure axioms, limit points, limits, mathematics, neighbourhoods, nets, topologyThu, 22 Aug 2019 18:12:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1847951145881379814Abhimanyu Pallavi Sudhir2019-08-22T18:12:00ZComment by Abhimanyu Pallavi Sudhir on Why is a topology made up of 'open' sets?
https://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173#19173
Ah wait, no it's fine -- your axiom 4 implies the converse of axiom 3, and preservation of binary unions leads to $A\subseteq B\Rightarrow \mathrm{cl}(A)\subseteq\mathrm{cl}(B)$.Wed, 21 Aug 2019 20:16:16 GMThttps://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173?cid=847795#19173Abhimanyu Pallavi Sudhir2019-08-21T20:16:16ZComment by Abhimanyu Pallavi Sudhir on Why is a topology made up of 'open' sets?
https://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173#19173
Wait -- so in a pre-topology, it's no longer true that "if $x$ touches $A\subset B$, then $x$ touches $B$"?Wed, 21 Aug 2019 07:32:27 GMThttps://mathoverflow.net/questions/19152/why-is-a-topology-made-up-of-open-sets/19173?cid=847627#19173Abhimanyu Pallavi Sudhir2019-08-21T07:32:27ZComment by Abhimanyu Pallavi Sudhir on Schrödinger equation derivation and Diffusion equation
https://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217#145217
What? Under the standard definition of a "wave equation", it must be second-order in time, which the Schrodinger equation is not. It may allow wave-like solutions, but it's fundamentally a (Wick rotated) diffusion equation.Tue, 20 Aug 2019 03:53:25 GMThttps://physics.stackexchange.com/questions/144832/schr%c3%b6dinger-equation-derivation-and-diffusion-equation/145217?cid=1121832#145217Abhimanyu Pallavi Sudhir2019-08-20T03:53:25ZTopology I: limits, continuity and homeomorphism, the neighbourhood filter, open sets
https://thewindingnumber.blogspot.com/2019/08/topology-limits-continuity-and.html
0The idea of a topology, although often linked to geometry* in some classifications, is perhaps better understood as something that has fundamentally more to do with groups, vector spaces and such. You know, when talking about groups, what we really care about is the <i>structure</i> -- we don't really care if we're dealing with $U(1)$ or $SO(2)$ or $\mathbb{Z}/2\pi\mathbb{Z}$ -- if two groups are isomorphic, they have the same "structure" and it's this structure we care about. We're interested in the properties of the group that are invariant under isomorphism -- these are the "group theoretic" properties. <br /><br />Anyway -- a group homomorphism $f:G\to H$ is a function that <b>preserves the group structure</b>, or alternatively <b>commutes with multiplication</b>, i.e. $f(x\cdot y) = f(x)\cdot f(y)$ (you might see the "commuting" analogy better if you write $x\cdot y$ as $\mathrm{mult}(x,y)$). Similarly, a vector space homomorphism or "linear transformation" $f:V\to W$ is a function that <b>preserves the linear structure</b> or alternatively <b>commutes with linear combination</b>/commutes with addition and scalar multiplication, i.e. $f(ax+y)=af(x)+f(y)$.<br /><br /><div class="twn-furtherinsight">Here's another example: the homomorphisms of order theory are increasing functions. Do you see why?</div><br /><div class="twn-beg">*: It's not completely unrelated to geometry, though, which as we discussed <a href="https://thewindingnumber.blogspot.com/2019/04/geometry-positive-definiteness-and.html">here</a> is the theory of the invariants of a manifold under some symmetries. Well, symmetries are analogous to isomorphisms, but different -- symmetries form a group, in that the transformation from one state of the manifold to another can be judged to be the "same transformation" as that between two other states, i.e. $gx=y$ and $gx'=y'$.
But there is no natural way to "parameterise" group isomorphisms so that you can compose them -- is this what one calls a "groupoid"? I don't know.</div><br />In general, commuting means some sort of preservation or non-interference policy. Here's another example: a <b>continuous function</b> $f:X\to Y$ is a function that <b>commutes with the limit</b>, i.e.<br /><br />$$f\left(\lim_{x\to a}x\right)=\lim_{x\to a}f(x)$$<br />The properties of a space that are invariant under continuous functions are precisely what encompass the theory of <b>topology</b> -- so a <b>topological homomorphism</b> is a "<b>continuous function</b>", and a <b>topological isomorphism</b> is a continuous function with a continuous inverse or a "<b>homeomorphism</b>". The <b><i>structure</i> of a topological space is precisely the limit operation</b>. It's not a binary operation like group multiplication $(\cdot):G^2\to G$ or vector addition $(+):V^2\to V$ or a unary function family like scalar multiplication $(\cdot):K\times V\to V$, but an operator family $(\lim):(\mathcal{I}\to X) \to (\mathcal{I}\to X)$ where $\mathcal{I}$ is fixed to be the domain set.<br /><br /><div class="twn-furtherinsight">We could also phrase this in an equivalent more natural manner -- a continuous function is one that preserves convergent sequences and their limits ("Cauchy" is incorrect on incomplete spaces, e.g. $1/x$ is continuous on $\mathbb{R}^+$ but maps the Cauchy sequence $(1/n)$ to $(n)$).</div><br />Well, our equation above doesn't even make sense for <b>incomplete spaces</b> like $\mathbb{R}^+$ or $\mathbb{Q}$ ($\lim_{x\to a}f(x)$ is not necessarily defined), but we still want to talk about the "topology" of such spaces. 
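To make the parenthetical about $1/x$ concrete, here is a quick numeric sanity check (an illustration of my own, not a proof):

```python
# f(x) = 1/x is continuous on the incomplete space R+ = (0, inf), yet it maps
# the Cauchy sequence x_n = 1/n to f(x_n) = n, which is not Cauchy.
x = [1 / n for n in range(1, 101)]   # x_n = 1/n, a Cauchy sequence in R+
fx = [1 / t for t in x]              # f(x_n) = n, which runs off to infinity

# consecutive tail terms of (x_n) cluster together ...
print(abs(x[98] - x[99]))    # ~0.0001
# ... but consecutive terms of (f(x_n)) stay about distance 1 apart forever
print(abs(fx[99] - fx[98]))  # ~1.0
```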
In any case, we've talked about spaces without even describing what the objects of the space are, or how the "space" arises from the set (endowing with a metric is too much information that isn't invariant under continuous functions).<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/7/79/Neighborhood_illust1.svg/777px-Neighborhood_illust1.svg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="753" data-original-width="777" height="310" src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/79/Neighborhood_illust1.svg/777px-Neighborhood_illust1.svg.png" width="320" /></a></div><br />How would we generalise the notion of a limit to an arbitrary space? The key is something called a "<b>filter</b>", specifically a "<b>neighbourhood filter</b>". Filters are interesting mathematical objects with wide applications, but in this context, the idea behind a neighbourhood filter is that we want a sort of "convergence" of a poset of sets to a point. You would recall from basic real analysis that expressions like $\min(\delta_1,\delta_2)$ and $\max(N_1,N_2)$ are of importance to many theorems -- this is taking the intersection of two neighbourhoods and forming a new one.
This leads to the following axioms:<br /><br /><b>Axioms for the neighbourhood topology</b><br /><ol><li>$\forall N\in N(x), x\in N$ (to relate the neighbourhoods to the points themselves)</li><li>$\forall N\in N(x), \exists M\in N(x), M\subseteq N\land \forall y\in M, N\in N(y)$ (to ensure the sets are only neighbourhoods to points on their "interior")</li><li>$\forall N\in N(x), \forall N' \supseteq N, N' \in N(x)$ (to express the notion of convergence -- we really want a "smallest neighbourhood", but since there is no such thing, we want a sequence of neighbourhoods that get arbitrarily small)</li><li>$\forall N_1, N_2\in N(x), N_1\cap N_2\in N(x)$ (the intersection thing)</li><li>$\forall x\in X, N(x)\ne\varnothing$ (this is not strictly necessary, but if there is a point without neighbourhoods, you could just remove it from the set and not lose anything topological -- do you see why? -- as it is not topologically related to any other point -- do you see why? -- think about what continuous functions do with them)</li></ol><div>The last three are the axioms of a filter, and the first two specify the <i>neighbourhood</i> filter in particular. Then the definition of a limit is:<br /><br />$$\lim_{x\to a}f(x)=L\iff \forall U\in N(L),f^{-1}(U)\in N(a)$$<br />And the definition of continuity is:<br /><br />$$\forall U\in N(f(a)),f^{-1}(U)\in N(a)$$<br />It's a useful exercise to confirm that with the standard neighbourhoods on $\mathbb{R}$, this reduces to the standard definition of the limit. It's also useful to generalise the standard properties of the limit with this definition -- start with the uniqueness of the limit. You'll see that this axiomatisation really does capture the basic "idea" of the limit.<br /><br />(If the second axiom is a bit unclear, the point is that the set $[0,1)$ is not a neighbourhood of the point 0 even though it contains it -- a neighbourhood must contain an epsilon-ball around the point.
Convince yourself that this is necessary by constructing a situation where $f^{-1}(U)$ is $[0,1)$ but $f$ is discontinuous.)<br /><br />These axioms are the <i>axioms for a topology defined in terms of neighbourhoods</i>, or a <b>neighbourhood topology</b><b> </b>-- any choice of neighbourhoods satisfying these axioms define a topology on the underlying set. <br /><br />The second axiom is appalling with its number of quantifiers, but it gives some solid idea of what a neighbourhood really is -- one can show it's equivalent to $\forall x\in\mathrm{Int}A, \mathrm{Int} A\in N(x)$ (where $\mathrm{Int}A$ is the interior of $A$, the set of points of which $A$ is a neighbourhood), i.e. that $\mathrm{Int}A$ is an <b>open set</b> and thus equivalent to $\forall N\in N(x), \mathrm{Int} N\in N(x)$. It is then easy to see that this implies the notion in our head that "<i>a neighbourhood of a point is a set that contains an open set containing that point</i>".<br /><br />OK, so we have these (circular!) definitions of neighbourhoods and open sets in terms of each other:<br /><br /><b>Relationship between neighbourhood topology and "associated" open set topology</b><br /><ol><li>An open set is a set that is the neighbourhood of each of its points: $O\in\Phi\iff \forall x \in O, O\in N(x)$</li><li>A neighbourhood of a point is a set containing an open set containing the point: $N\in N(x)\iff \exists O\in\Phi, x\in O\subseteq N$</li></ol>Now, so far we've been starting with neighbourhoods and defining open sets from them with (1) -- and our discussion above shows that this respects (2), but instead, we could start with a definition of topology in terms of open sets and define neighbourhoods from them with (2), then check that it respects (1). So we need to find out what <i>axioms</i> an <b>open set topology</b> must satisfy such that the associated neighbourhoods satisfy the axioms for a neighbourhood topology, and that respects (1) (i.e. 
such that the open sets arising from these neighbourhoods are the same as the original open sets).<br /><br />To find these axioms, we'll simply try and "prove" each of the neighbourhood topology axioms and see what axioms we could use on the open sets to do so.<br /><br /><b>Figuring out the right open set axioms: Part I</b><br /><ol><li>We want $\forall N\in N(x), x\in N$. But this follows from the definition of the neighbourhood in (2), so no axioms on the open sets are imposed here.</li><li>Same as above -- you can trivially confirm that the open set contained in the neighbourhood is the $M$ we want. That these two axioms follow from the definition was the point of the definition (2).</li><li>Same as above.</li><li>What's the open set around $x$ contained by $N_1\cap N_2$? The simplest answer is $O_1\cap O_2$. So this produces an axiom: <b>open sets are closed under finite intersection</b>. </li><li>Every point having a neighbourhood is equivalent to <b>every point being contained in an open set</b>. </li></ol>Now to ensure the $\text{open sets}\to \text{neighbourhoods} \to \text{open sets}$ chain ends up where it started -- what this is basically doing is ensuring the converse $\text{neighbourhood topology}\Rightarrow\text{open set topology}$, because we will be able to show that these "open set axioms" will be true for the open sets arising from a neighbourhood topology, and the point is to find out what axioms are needed to ensure these are the same as the original open set topology and thus the axioms apply to the original open set topology. Do you see why this is kind of a tricky game? Let's hope it works.<br /><br /><b>Figuring out the right open set axioms: Part II</b><br /><br />The neighbourhoods arising from the open set topology $\Phi$ are $N(x)=\{S\subseteq X\mid\exists O\in\Phi,x\in O\subseteq S\}$. The open sets arising from these are $\Phi' = \{S\subseteq X\mid \forall x\in S, S\in N(x)\}$.
Or:<br /><br />$$\Phi'=\{S\subseteq X\mid \forall x\in S, \exists O\in\Phi, x\in O\subseteq S\}$$<br />It is clear that $\Phi\subseteq\Phi'$. We want to show the converse, that $\Phi'\subseteq\Phi$, i.e.<br /><br />$$\forall x\in S, \exists O\in\Phi, x\in O\subseteq S\Rightarrow S\in \Phi$$<br />Or in English: "if every point in $S$ has an open set around it contained in $S$, then $S$ is an open set". Recognising that $S$ is the union of all the $O$'s, it is easy to show that this is equivalent to: if $O_\lambda$ is an arbitrary family of sets in $\Phi$, then $\bigcup_\lambda O_\lambda \in \Phi$, i.e. <b>open sets are closed under arbitrary union</b>.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-p30Ty7KDm1I/XVJ-PNIlyBI/AAAAAAAAFqk/FkRJ6MhfhW4T6wsZvSVr62R1yNIe0vsAQCLcBGAs/s1600/neighbourhoods%2Bunion.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="619" data-original-width="443" height="320" src="https://1.bp.blogspot.com/-p30Ty7KDm1I/XVJ-PNIlyBI/AAAAAAAAFqk/FkRJ6MhfhW4T6wsZvSVr62R1yNIe0vsAQCLcBGAs/s320/neighbourhoods%2Bunion.png" width="229" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The union of every such blue circle you can create is equal to $S$.</td></tr></tbody></table>So we now have an axiomatisation of topology in terms of open sets, the open sets topology: we have a topology on a set $X$ given by a set of its subsets $\Phi$ satisfying:<br /><br /><b>Axioms for the open set topology: 1st ed.</b><br /><ol><li>Every point is contained in an open set: $\forall x\in X, \exists O\in\Phi, x\in O$.</li><li>Closure under finite intersection: $\forall O_1, O_2\in \Phi, O_1\cap O_2\in\Phi$.</li><li>Closure under arbitrary union: $\forall (O_\lambda)\in\Phi^\mathcal{I}, \bigcup_\lambda 
O_\lambda\in\Phi$. </li></ol>(Exercise: show that these axioms are implied on the open sets by the neighbourhood topology axioms. This is important, because at least with the finite intersection axiom, we just chose the "simplest axiom" available to us, which was stronger than the actual axiom we required, which was $\forall x\in X, \forall (O_1, O_2 \in \Phi)\ni x, \exists O\in\Phi, x\in O\subseteq O_1\cap O_2$. Well, along with the arbitrary union axiom this implies finite intersection, anyway.)<br /><br />There's actually a further re-formulation one can make: <i>every point is in an open set</i> is, due to the arbitrary union axiom, equivalent to <b>the universe is an open set</b>. So our <b>final formulation of topology</b> is:<br /><br /><b>Axioms for the open set topology: 2nd ed.</b><br /><ol><li>$X\in\Phi$.</li><li>Closure under finite intersection: $\forall O_1, O_2\in \Phi, O_1\cap O_2\in\Phi$.</li><li>Closure under arbitrary union: $\forall (O_\lambda)\in\Phi^\mathcal{I}, \bigcup_\lambda O_\lambda\in\Phi$. </li></ol>Recall that the fifth axiom of the neighbourhood topology was <b>optional</b> -- $X\in\Phi$ is similarly optional -- without it, the topology would basically just be a topology over the union of all open sets. But you'd then have a topology with "invisible points" (or "topologically invisible points"), which is kinda stupid.<br /><br />Sometimes, $\varnothing\in\Phi$ is also added as an axiom, but this can be proven from the arbitrary union axiom, as the empty set is simply the empty union. You can see why this must be true, as indeed each point of the empty set vacuously has it as a neighbourhood.</div><br /><div class="twn-beg">We were able to show that a topology can be determined completely by its open sets -- but the sense in which the open sets form the structure of the topological space is quite different from seeing the limit operator as the structure of the topological space.
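These finished axioms are mechanical enough to check on finite examples -- a throwaway sketch (the function name is mine, not standard):

```python
from itertools import combinations

def is_topology(X, phi):
    """Check the open-set axioms on a finite example: X itself is open,
    pairwise intersections are open, and the union of every subfamily is
    open (the empty union is the empty set, which -- as noted above --
    forces the empty set to be open)."""
    phi = {frozenset(s) for s in phi}
    if frozenset(X) not in phi:
        return False
    if any(a & b not in phi for a in phi for b in phi):
        return False
    opens = list(phi)
    for r in range(len(opens) + 1):
        for fam in combinations(opens, r):
            if frozenset().union(*fam) not in phi:
                return False
    return True

# The Sierpinski space: a genuine topology on {0, 1}
print(is_topology({0, 1}, [set(), {0}, {0, 1}]))  # True
# Leaving out the empty set breaks the arbitrary-union axiom (empty union)
print(is_topology({0, 1}, [{0}, {0, 1}]))         # False
```

Note how the empty-union check makes $\varnothing\in\Phi$ fall out of the arbitrary union axiom for free, exactly as the text says.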
I wonder if some nice analogy to groups or linear spaces exists. <b>Can a group be determined by its subgroups?</b> Open sets are basically <b>sub-(topological spaces)</b> -- aren't they? All limits within an open subspace are the same as in the original (you can see this rigorously when we define subspace topologies). See <a href="https://math.stackexchange.com/questions/3323761/is-a-finite-group-determined-by-its-subgroups"><b>my math stackexchange question</b></a>.</div><br /><div class="twn-furtherinsight">If you want to really appreciate the motivation for the axioms of a neighbourhood, you need to understand a neighbourhood as a notion of "wiggling", e.g. when we say "for all neighbourhoods", we mean "for even the slightest wiggle", and the reason this translates this way is that supersets are part of the filter.</div>abstract mathematics, continuous function, filters, homeomorphism, mathematics, neighbourhoods, open sets, topologyFri, 16 Aug 2019 04:39:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6645129710122664101Abhimanyu Pallavi Sudhir2019-08-16T04:39:00ZIs a (finite) group determined by its subgroups?
https://math.stackexchange.com/questions/3323761/is-a-finite-group-determined-by-its-subgroups
6<p><strong>Motivation</strong></p>
<p>I think of the "structure" of a topological space <span class="math-container">$X$</span> as being the limit operator on functions <span class="math-container">$I\to X$</span> where <span class="math-container">$I$</span> could be the natural numbers or another topological space -- in this sense, a topological homomorphism (continuous function) <span class="math-container">$f$</span> is a function that commutes with the limit operation <span class="math-container">$f(\lim x)=\lim f(x)$</span>, similar to how a group homomorphism commutes with group multiplication <span class="math-container">$f(\mathrm{mult}(x,y))=\mathrm{mult}(f(x),f(y))$</span> and a linear transformation commutes with linear combination.</p>
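As a toy illustration of the "commutes with the operation" condition in the finite group case (example and names my own):

```python
# A homomorphism f : (Z/4, +) -> (Z/2, +) must satisfy f(x + y) = f(x) + f(y).
def is_hom(f, n, m):
    """Check f((x + y) mod n) == (f(x) + f(y)) mod m for all x, y in Z/n."""
    return all(f((x + y) % n) == (f(x) + f(y)) % m
               for x in range(n) for y in range(n))

print(is_hom(lambda x: x % 2, 4, 2))      # True: reduction mod 2 preserves +
print(is_hom(lambda x: min(x, 1), 4, 2))  # False: f(1+1) = 1 but f(1)+f(1) = 0
```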
<p>Nonetheless, it can be shown that this structure can be determined uniquely by the set of open sets on <span class="math-container">$X$</span>. One may also understand these open sets to be the "sub-(topological spaces)" of <span class="math-container">$X$</span> as the topology of <span class="math-container">$X$</span> is inherited by them exactly (well, the closed sets are also a "dual" kind of sub-topological spaces). </p>
<p>Similarly, given a set <span class="math-container">$V$</span> and a list of subsets that we call "subspaces" (which would have to satisfy some properties), one can determine the vector space up to isomorphism (i.e. we can find its dimension).</p>
<hr>
<p>I wonder if something like this can be done with groups. Given a set <span class="math-container">$G$</span> and a list of subsets we call its "subgroups", can we determine the group up to isomorphism? At least for finite sets?</p>
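As a brute-force sanity check of the example below (a hypothetical sketch of my own, not from the question): for a finite group, a nonempty subset containing the identity and closed under the operation is automatically a subgroup, so the full "subgroup structure" of a small cyclic group can simply be enumerated.

```python
from itertools import combinations

def subgroups(n):
    """All subgroups of the cyclic group Z/n, found by brute force.

    A nonempty subset of a finite group that is closed under the
    operation is automatically a subgroup (inverses follow from
    finiteness), so closure is the only thing to check.
    """
    subs = []
    for k in range(1, n + 1):
        for c in combinations(range(n), k):
            s = set(c)
            if 0 in s and all((a + b) % n in s for a in s for b in s):
                subs.append(s)
    return subs

print(subgroups(4))  # → [{0}, {0, 2}, {0, 1, 2, 3}]
```

This reproduces exactly the subgroup structure quoted in the question for $C_4$.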
<p>Example: given the set <span class="math-container">$\{0, 1, 2, 3\}$</span>, we'd be given the following "subgroup structure" on it: <span class="math-container">$\{\{0\},\{0,2\},\{0,1,2,3\}\}$</span>, and the group being described is <span class="math-container">$C_4$</span>. The positions of 1 and 3 aren't determined, but the group is still determined up to isomorphism.</p>group-theoryfinite-groupsThu, 15 Aug 2019 05:43:06 GMThttps://math.stackexchange.com/q/3323761Abhimanyu Pallavi Sudhir2019-08-15T05:43:06ZComment by Abhimanyu Pallavi Sudhir on How is it possible that consciousness-causes-collapse interpretations of QM are not falsified by the Quantum Zeno effect?
https://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495137#495137
@Wolphramjonny Just write down the state vector for the combined system of the (not yet measured) "non-conscious" apparatus and the system being measured. This represents the "knowledge of the system according to an external observer". As you can see, metaphysical questions about the "knowledge of the apparatus" are not involved in the expression.Mon, 05 Aug 2019 07:46:28 GMThttps://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495137?cid=1115461#495137Abhimanyu Pallavi Sudhir2019-08-05T07:46:28ZAnswer by Abhimanyu Pallavi Sudhir for How is it possible that consciousness-causes-collapse interpretations of QM are not falsified by the Quantum Zeno effect?
https://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495137#495137
1<p>If you accept positivism, it becomes obvious that "consciousness causes collapse" cannot possibly be distinguished experimentally from the Copenhagen interpretation as long as you accept that <em>you</em> are conscious. </p>
<p>This "interpretation" makes claims about the knowledge of <em>another</em> (non-conscious) observer, claiming that it does not alter the state of other systems. But this is fundamentally a metaphysical claim -- it's like asking "what if my red is your blue and my blue is your red?" Whatever your metaphysical belief on whether a non-conscious observer "already" caused a wavefunction collapse, your knowledge only changes when you observe the system, be it of that non-conscious observer.</p>Sun, 04 Aug 2019 05:52:21 GMThttps://physics.stackexchange.com/questions/495125/-/495137#495137Abhimanyu Pallavi Sudhir2019-08-04T05:52:21ZComment by Abhimanyu Pallavi Sudhir on How is it possible that consciousness-causes-collapse interpretations of QM are not falsified by the Quantum Zeno effect?
https://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495127#495127
There is no testable definition for consciousness, but wavefunction collapse is certainly a well-defined idea -- it represents a collapse of the probability (amplitude) distribution representing your knowledge of a system.Sun, 04 Aug 2019 05:20:15 GMThttps://physics.stackexchange.com/questions/495125/how-is-it-possible-that-consciousness-causes-collapse-interpretations-of-qm-are/495127?cid=1115081#495127Abhimanyu Pallavi Sudhir2019-08-04T05:20:15ZOrthogonal group, indefinite orthogonal group, orthochronous stuff
https://thewindingnumber.blogspot.com/2019/08/orthogonal-group-indefinite-orthogonal.html
0This post appears in the <a href="https://thewindingnumber.blogspot.com/p/1103.html">Linear Algebra</a> and <a href="https://thewindingnumber.blogspot.com/p/2101.html">Special Relativity</a> courses.<br /><br />There are several ways to see that the matrices satisfying $A^TA=I$ are related to rotations in some way, other than just expanding out the components like a dumb pygmy chimp -- no, we are the normal chimp:<br /><ol><li>Write it as $A^TIA=I$ -- i.e. the set of matrices that <b>preserve the identity quadratic form</b>. The identity quadratic form corresponds to the $n$-sphere (e.g. a circle), so we're looking for transformations that preserve the $n$-sphere. A clearer way to see this is that preserving the quadratic form $I$ is equivalent to preserving the valuation $x^TIy$ for all $x, y$, i.e. $(Ax)^TI(Ay)=x^TIy$, so it preserves the value of each contour.</li><li>With the same logic as above, $(Ax)^T(Ay)=x^Ty$, i.e. the <b>preservation of the Euclidean dot product</b> means that all lengths and angles are preserved. These are called "rigid rotations", and are basically the kind of stuff we can do to a sheet of paper without compressing or stretching it in any way -- i.e. if we nudge a vector by a certain angle, every other vector should also be nudged by the same angle.</li></ol><div>What kind of transformations preserve the unit sphere? 
</div><br /><div class="twn-furtherinsight">The reason this is a good way of understanding things is that there are plenty of other such "dot products" you can define in mathematics, corresponding to <b>different geometries</b> -- each can be based on the bilinear form it preserves, see this <a href="https://thewindingnumber.blogspot.com/2019/04/geometry-positive-definiteness-and.html">later linear algebra article</a> for more details, relating to isomorphisms of such geometries etc.</div><br />As for discriminating between rotations and reflections, suppose we define rotations in a completely geometric way -- for a matrix to be a rotation, all its eigenvalues are either 1 or in pairs of unit complex conjugates.<br /><br />What do the eigenvalues of orthogonal matrices look like? For each eigenvalue, you need $\overline{\lambda}\lambda=1$, i.e. all the eigenvalues are unit complex numbers. If a complex eigenvalue isn't <b>paired with a corresponding conjugate</b>, you will not get a real-valued transformation on $\mathbb{R}^n$. Meanwhile if an eigenvalue of -1 isn't paired with another -1 -- i.e. if there are an <b>odd number of reflections</b> -- you get a reflection. In this sense, the "<b>conjugate eigenvalues</b>" property of rotations can be seen as a <b>generalisation of the "$s_1s_2=r$" property</b> which you may have learned from plane geometry or dihedral groups. 
The orthogonal (or rather unitary) transformations that do not behave this way are precisely the rotations.<br /><br />The similarity between unpaired unit complex eigenvalues and unpaired -1's is interesting, by the way -- when thinking about reflections, you might have gotten the idea that reflections are $\pi$-angle rotations in a higher-dimensional space -- like the vector was rotated through a higher-dimensional space and then landed on its reflection -- like it was a discrete snapshot of a process as smooth as any rotation.<br /><br />Well, <b>now you know what this higher-dimensional space is</b> -- precisely $\mathbb{C}^n$. And the determinant of a unitary matrix also takes a continuous spectrum -- the entire unit circle. In this sense (among other senses) complex linear algebra is <b>more "complete" than real linear algebra</b>. In fact, you will see in Lie theory that the group $SO(n)$ is connected but $O(n)$ is not, while $SU(n)$ and $U(n)$ are both connected. Can you see why?<br /><br />(<a href="https://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807">original version of above originally posted to math stackexchange</a>)<br /><br />Well, here, we benefited from the fact that the product of two reflections is a rotation -- so we could just enforce the "even number of flips", i.e. that $\det A=1$, to specify rotations. But what if we're dealing with one of the "generalised geometries" we discussed? What if instead of preserving $I$, we wanted the group $O(m\mid n)$, i.e. that preserves some $\mathrm{diag}(m\mid n)$ with $m$ 1's and $n$ -1's along the diagonal?<br /><br />Well, then we don't have rotations between the "1"-labeled (spatial) axes and the "-1"-labeled (temporal) axes, only <b>boosts</b>. But compositions between such reflections form <i style="font-weight: bold;">rotations</i>! 
So simply restricting that $\det A = 1$ will -- while still forming a group $SO(m \mid n)$ -- retain all these rotations which can only be understood as compositions of reflections.<br /><br />So how do we extract the transformations we want? (What transformations <i>do</i> we want? The ones that correspond to changes of reference frame, in special relativity language -- well, in the sense of Lie theory, this means we're looking for the "<b>component connected to the identity</b>" -- do you see why?)<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/9/91/HyperboloidOfTwoSheets.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="456" data-original-width="406" height="320" src="https://upload.wikimedia.org/wikipedia/commons/9/91/HyperboloidOfTwoSheets.png" width="284" /></a></div><br />Let's think about this more clearly. Start by noting that not all reflections in spacetime preserve the Minkowski metric $\mathrm{diag}(m\mid n)$ -- only those that preserve the invariant hyperboloids. In the case of 3+1-spacetime, this means infinite spatial reflections and one time-reversal -- in the case of a general $m+n$-spacetime, this means infinite <b>spatial reflections</b> and infinite <b>temporal reflections</b> (in any $m+n-1$-plane whose normal vector is temporal, not to be confused with time-like). When you multiply an odd temporal reflection with an odd spatial reflection, you get an even time-space rotation, which is in $SO(3\mid 1)$. 
<br /><br />$$A = \left[ {\begin{array}{*{20}{c}}{{A_t}}&B^T\\C&{{A_s}}\end{array}} \right]$$<br />(Note on notation: we'll use ${A_T} = \left[ {\begin{array}{*{20}{c}}{{A_t}}&0\\0&I\end{array}} \right]$ and analogously ${A_S} = \left[ {\begin{array}{*{20}{c}}I&0\\0&{{A_s}}\end{array}} \right]$, where $A_t$ and $A_T$ are "basically the same thing", and analogously for $A_s$ and $A_S$ -- in particular $\det A_t=\det A_T$ and $\det A_s=\det A_S$.)<br /><br />We see the problem: instead of just mandating $\det A=1$, we must mandate that the <b>temporal minor</b> and the <b>spatial minor</b> of the matrix both have determinant 1, $\det A_t=\det A_s = 1$. But this isn't right -- if you have a boost, i.e. some mixing between the space and time co-ordinates, then $A\ne A_TA_S$ and the component determinants are multiplied by a Lorentz factor (even though still $\det A = 1$). So we mandate instead that $\det A_t>0$, $\det A_s>0$ (equivalently $\ge 1$). Such transformations are called the <b>proper orthochronous Lorentz transformations</b>, because in the context of special relativity they are proper Lorentz transformations that do not flip time:<br /><br />$$SO^{+}(3 \mid 1)=\{A\in O(3 \mid 1) \mid \det A_t >0, \det A_s >0\}$$<br />OK, how do we show $SO^{+}(m\mid n)$ is a subgroup? You might get the notion that because of the "<b>two sheets hyperbola</b>" topology of the group, the sheet connected to the identity must be a <b>subgroup</b> (and the other sheet a <b>coset</b>) because moving about on the sheet keeps you on the sheet (and that's what group multiplication is -- moving about on the sheet). The formal way to say this is to say that the map $A\mapsto \mathrm{sgn}(\det A_t )$ is a group <b>homomorphism</b> to the cyclic group $\{1,-1\}$, so its kernel is necessarily a <b>normal subgroup</b> (do you see how these are the same thing?).<br /><br />So the key is to prove that for two matrices satisfying $\mathrm{sgn}(\det A_t )>0$, their product does too. 
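In the familiar $1+1$-dimensional case this multiplicativity of $\mathrm{sgn}(\det A_t)$ is easy to check numerically (a quick sketch of my own, with the time coordinate first):

```python
import numpy as np

def boost(phi):
    # Lorentz boost in 1+1 dimensions with rapidity phi (time coordinate first);
    # preserves the metric diag(1, -1)
    c, s = np.cosh(phi), np.sinh(phi)
    return np.array([[c, s], [s, c]])

T = np.diag([-1.0, 1.0])  # time reversal: det(A_t) = -1

A, B = boost(0.7), boost(-1.3)
assert (A @ B)[0, 0] > 0        # orthochronous x orthochronous stays orthochronous
assert (T @ A)[0, 0] < 0        # one time flip leaves the identity component
assert (T @ T @ A)[0, 0] > 0    # two flips compose back into SO+(1,1)
```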
A proof of the $SO^+(m\mid 1)$ case (relevant for relativity) can be found <a href="https://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36425#comment1111428_36425">here</a> -- I'm not sure how that proof can be appropriately generalised to $SO^+(m\mid n)$. I've written out the first few steps here:<br /><ol><li>Multiply the two matrices $A$ and $\tilde{A}$ to show $(A\tilde{A})_t=A_t\tilde{A}_t+B^T\tilde{C}$. We want to show the determinant of this is positive.</li><li>From multiplying out $A^T\eta A=\eta$ and $A\eta A^T=\eta$, we see that $A_t^2-C^TC=A_t^2-B^TB=I$ and analogous for $\tilde{A}$.</li><li>So $\det((A\tilde{A})_t-A_t\tilde{A}_t)=\det(B^T\tilde{C})=\sqrt{\det(A_t^2-I)\det(\tilde{A}_t^2-I)}$</li><li>Well, I'm not sure how to proceed at this point. Does $\det(X-PQ)=\det((P^2-I)(Q^2-I))^{1/2}$ imply that $\det P\ge1\land\det Q\ge1\Rightarrow \det X>0$?</li></ol>Well, I can't think of a way to continue -- and certainly one can think of a much wider category of problems like this, where we have a much simpler topological picture in our heads than rubbish algebra like the above would betray. So we need a <i>topological way of looking at Lie groups</i>.<br /><br />You might think of just considering something like the orbit of a vector -- e.g. the unit time vector -- under the group for the topology, but this <b>does not fully describe the topology of the group</b>. As an illustration, in the above example, for $n>1$, the orbit of the time vector under $O^+(m\mid n)$ is actually connected (prove this -- you need to count the number of sheets a general hyperbola has), while the entire topology of the group is actually disconnected, as we will see. 
A simple way to see that these are two different topologies is that spatial rotations/reflections leave the unit time vector unchanged and therefore all correspond to a single point on the orbit.<br /><br />This will be our starting point to motivate the study of the topology of a Lie group in the <a href="https://thewindingnumber.blogspot.com/p/1203.html">Lie theory</a> articles.lie groupslie theorylorentz grouplorentz transformationsorthogonal groupspecial relativitySat, 03 Aug 2019 14:27:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-9034342794956253801Abhimanyu Pallavi Sudhir2019-08-03T14:27:00ZComment by Abhimanyu Pallavi Sudhir on Was "Crook's algorithm" for Sudoku really only developed in the 21st century?
https://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century
@ArnaudMortier How so?Fri, 02 Aug 2019 16:54:03 GMThttps://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century?cid=252361Abhimanyu Pallavi Sudhir2019-08-02T16:54:03ZComment by Abhimanyu Pallavi Sudhir on Was "Crook's algorithm" for Sudoku really only developed in the 21st century?
https://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century
@GarethMcCaughan Note that (1) and (2) are special cases of (3) for K = 1, K = total number of empty squares - 1.Fri, 02 Aug 2019 16:53:28 GMThttps://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century?cid=252360Abhimanyu Pallavi Sudhir2019-08-02T16:53:28ZComment by Abhimanyu Pallavi Sudhir on Why would ReLU work as an activation function at all?
https://stats.stackexchange.com/questions/297947/why-would-relu-work-as-an-activation-function-at-all/298159#298159
Except the standard proof of the universal approximation theorem relies on the boundedness of the activation functions. There are <a href="https://arxiv.org/pdf/1505.03654.pdf" rel="nofollow noreferrer">extensions</a>, but the fact that ReLU works is not obvious to me.Fri, 02 Aug 2019 16:50:58 GMThttps://stats.stackexchange.com/questions/297947/why-would-relu-work-as-an-activation-function-at-all/298159?cid=784263#298159Abhimanyu Pallavi Sudhir2019-08-02T16:50:58ZWas "Crook's algorithm" for Sudoku really only developed in the 21st century?
https://puzzling.stackexchange.com/questions/86805/was-crooks-algorithm-for-sudoku-really-only-developed-in-the-21st-century
4<p>The following algorithm for simplifying (and very often completely solving) Sudoku puzzles:</p>
<ol>
<li>Label each cell with the set of all possible values it could take.</li>
<li>Pick a row/column/block and for a value of <span class="math-container">$K\in[1, 9)$</span>, look for "<span class="math-container">$K$</span>-partnerships" -- <span class="math-container">$K$</span>-tuples of cells that satisfy "the union of labels of each cell in the tuple has cardinality <span class="math-container">$K$</span>". Call the "union of labels of each cell in a partnership" the "banned set" of the partnership.</li>
<li>For each such partnership, for all cells in that row/column/block <em>not</em> in the partnership remove any element in its label that are in the banned set of the partnership.</li>
<li>Repeat Steps 2-3 for all values of <span class="math-container">$K$</span> and all rows, columns and blocks.</li>
</ol>
<p>(i.e. "if you have three cells labeled as (4, 5), (4, 7), (4, 5, 7), no other cell in that row can be 4, 5 or 7") </p>
<p>... has always seemed obvious to me, but I'm now informed from some sources that it has a name called "Crook's algorithm":</p>
<ul>
<li><a href="http://pi.math.cornell.edu/~mec/Summer2009/meerkamp/Site/Solving_any_Sudoku_II.html" rel="nofollow noreferrer">http://pi.math.cornell.edu/~mec/Summer2009/meerkamp/Site/Solving_any_Sudoku_II.html</a></li>
<li><a href="https://www.ams.org/notices/200904/tx090400460p.pdf" rel="nofollow noreferrer">https://www.ams.org/notices/200904/tx090400460p.pdf</a></li>
</ul>
<p>The latter (by Crook) attributes the algorithm to texts written in 2005 and 2006. Are these really the earliest references? I'm pretty sure this must have been well-known for decades, but I'm not sure what to search for to find older references.</p>sudokupuzzle-historyFri, 02 Aug 2019 07:02:08 GMThttps://puzzling.stackexchange.com/q/86805Abhimanyu Pallavi Sudhir2019-08-02T07:02:08ZComment by Abhimanyu Pallavi Sudhir on Is there go up line character? (Opposite of \n)
https://stackoverflow.com/questions/11474391/is-there-go-up-line-character-opposite-of-n/11474509#11474509
Doesn't work with Windows/Python 3.Thu, 01 Aug 2019 19:49:57 GMThttps://stackoverflow.com/questions/11474391/is-there-go-up-line-character-opposite-of-n/11474509?cid=101123890#11474509Abhimanyu Pallavi Sudhir2019-08-01T19:49:57ZComment by Abhimanyu Pallavi Sudhir on output to the same line overwriting previous output ? python (2.5)
https://stackoverflow.com/questions/4897359/output-to-the-same-line-overwriting-previous-output-python-2-5/27023394#27023394
Using <code>end = '\r'</code> instead fixes the problem in Python 3.Wed, 31 Jul 2019 19:10:36 GMThttps://stackoverflow.com/questions/4897359/output-to-the-same-line-overwriting-previous-output-python-2-5/27023394?cid=101088967#27023394Abhimanyu Pallavi Sudhir2019-07-31T19:10:36ZAnswer by Abhimanyu Pallavi Sudhir for Neural Networks vs. Polynomial Regression/Other techniques for curve fitting?
https://math.stackexchange.com/questions/2901209/neural-networks-vs-polynomial-regression-other-techniques-for-curve-fitting/3308606#3308606
0<p>Polynomial regression usually just encodes the wrong Bayesian prior. Real-world functions often have highly "non-local" effects that require high-degree polynomials to capture, but polynomial regression of a fixed degree assigns zero prior probability to all higher-degree polynomials. As it turns out, neural networks happen to provide a reasonably good prior (perhaps that's why our brains work that way -- if they even do).</p>Tue, 30 Jul 2019 18:31:31 GMThttps://math.stackexchange.com/questions/2901209/-/3308606#3308606Abhimanyu Pallavi Sudhir2019-07-30T18:31:31ZComment by Abhimanyu Pallavi Sudhir on How to prove that orthochronous Lorentz transformations $O^+(1,3)$ form a group?
https://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36388#36388
This <a href="https://physics.stackexchange.com/questions/494260/orthochronous-lorentz-transformations-om-n-form-a-group?noredirect=1&lq=1">doesn't generalise</a> to the case of $O^+(m,n)$ where $n>1$, because the hyperboloid is then connected.Tue, 30 Jul 2019 07:59:10 GMThttps://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36388?cid=1112922#36388Abhimanyu Pallavi Sudhir2019-07-30T07:59:10ZComment by Abhimanyu Pallavi Sudhir on How to prove that orthochronous Lorentz transformations $O^+(1,3)$ form a group?
https://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36385#36385
Function composition is always associative. Non-associative objects arise when the product of two objects (that otherwise act as transformations) is not the composition of their corresponding transformations.Tue, 30 Jul 2019 07:34:41 GMThttps://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36385?cid=1112918#36385Abhimanyu Pallavi Sudhir2019-07-30T07:34:41ZComment by Abhimanyu Pallavi Sudhir on Orthochronous indefinite orthogonal group $O^+(m, n)$ forms a group
https://physics.stackexchange.com/questions/494260/orthochronous-indefinite-orthogonal-group-om-n-forms-a-group
@G.Smith $\det(\Lambda_a)>0$ (and in fact $\ge 1$).Tue, 30 Jul 2019 05:17:23 GMThttps://physics.stackexchange.com/questions/494260/orthochronous-indefinite-orthogonal-group-om-n-forms-a-group?cid=1112885Abhimanyu Pallavi Sudhir2019-07-30T05:17:23ZOrthochronous indefinite orthogonal group $O^+(m, n)$ forms a group
https://physics.stackexchange.com/questions/494260/orthochronous-indefinite-orthogonal-group-om-n-forms-a-group
1<p>My question is based on Qmechanic's answer <a href="https://physics.stackexchange.com/a/36425/23119">here</a> which proves that <span class="math-container">$O^+(m, 1)$</span> forms a group -- that if two Lorentz transformations have positive time-time co-ordinate, so does their product. The key is that with the Lorentz transformation written in the form:</p>
<p><span class="math-container">$$\Lambda = \left[\begin{array}{cc}\Lambda_a & \Lambda_b^t \cr \Lambda_c &\Lambda_R \end{array} \right].$$</span></p>
<p>We can show that <span class="math-container">$|(\Lambda\tilde{\Lambda})_a-\Lambda_a\tilde{\Lambda}_a|\le \sqrt{(\Lambda_a^2-1)(\tilde{\Lambda}_a^2-1)}$</span> which implies that positive <span class="math-container">$\Lambda_a,\tilde{\Lambda_a}$</span> imply positive <span class="math-container">$(\Lambda\tilde{\Lambda})_a$</span>.</p>
<p>Well, the trouble is that this uses the Cauchy-Schwarz inequality in Step 6, and therefore doesn't work for the general case of <span class="math-container">$O^+(m, n)$</span>. How would one generalise the proof to <strong>prove the orthochronous indefinite orthogonal group <span class="math-container">$O^+(m, n)$</span> is a group</strong>?</p>
<p>Here's what I've tried so far: defining <span class="math-container">$O^{+}(m,n)$</span> as the subset of <span class="math-container">$O(m,n)$</span> with elements <span class="math-container">$\Lambda$</span> which satisfy <span class="math-container">$\det(\Lambda_a)>0$</span> (and in fact <span class="math-container">$\ge 1$</span>), </p>
<ol>
<li><p>As before, <span class="math-container">$(\Lambda\tilde{\Lambda})_a=\Lambda_a\tilde{\Lambda}_a+\Lambda_b^T\tilde{\Lambda}_c$</span>. </p></li>
<li><p>From multiplying out <span class="math-container">$\Lambda^T\eta \Lambda=\eta$</span> and <span class="math-container">$\Lambda\eta \Lambda^T=\eta$</span>, we see that <span class="math-container">$\Lambda_a^2-\Lambda_c^T\Lambda_c=\Lambda_a^2-\Lambda_b^T\Lambda_b=I$</span> and analogous for <span class="math-container">$\tilde{\Lambda}$</span>.</p></li>
<li><p>So <span class="math-container">$\det\left((\Lambda\tilde{\Lambda})_a-\Lambda_a\tilde{\Lambda}_a\right)=\det\left(\Lambda_b^T\tilde{\Lambda}_c^T\right)=\sqrt{\det\left(\Lambda_a^2-I\right)\det\left(\tilde{\Lambda}_a^2-I\right)}$</span>.</p></li>
</ol>
<p>Well, I'm not sure how to proceed at this point. Does <span class="math-container">$\det(X-PQ)=\det((P^2-I)(Q^2-I))^{1/2}$</span> imply that <span class="math-container">$\det P\ge 1\land\det Q\ge 1\Rightarrow \det X>0$</span> <em>in general</em>?</p>
<p>The <a href="https://physics.stackexchange.com/a/36388/23119">"topological proof" from Ron Maimon</a> does not work either, as the orbit of the unit time vector is <a href="https://math.stackexchange.com/questions/2022156/how-many-sheets-can-a-hyperboloid-have-in-n-dimensions">connected when <span class="math-container">$n>1$</span></a>. I suspect that a more powerful technique than looking at the orbit of the unit time vector would be to look at the topology of the Lie group itself -- but I'm not that familiar with this stuff.</p>special-relativitymathematical-physicsgroup-theorylorentz-symmetrytopologyMon, 29 Jul 2019 20:30:31 GMThttps://physics.stackexchange.com/q/494260Abhimanyu Pallavi Sudhir2019-07-29T20:30:31ZAnswer by Abhimanyu Pallavi Sudhir for What is the frequency of white light?
https://physics.stackexchange.com/questions/494081/what-is-the-frequency-of-white-light/494085#494085
1<p>It doesn't have a specific frequency -- it has a frequency distribution.</p>
<p>You don't even need to go as far as white light -- just consider a "camel hump" wave, like <span class="math-container">$\sin ax+\sin bx$</span> -- what's the frequency of a light wave that looks like this? The answer is that its frequency isn't a fixed value, but a distribution, taking values <span class="math-container">$a/2\pi$</span> and <span class="math-container">$b/2\pi$</span> with half probability each. In general, if you have some function <span class="math-container">$f(x)$</span>, the way to obtain this <strong>frequency distribution</strong> is to decompose <span class="math-container">$f(x)$</span> in terms of sinusoids -- this is precisely the <strong>Fourier transform</strong>.</p>
<p>In the specific case you mentioned, position and momentum ("frequency") are "Fourier duals" of each other. If you have a sinusoid (by which I mean <span class="math-container">$e^{2\pi i\xi x}$</span>), you have complete uncertainty about the position, but have a precise value for the momentum: <span class="math-container">$h\xi$</span>. On the other hand, if you had localised your position completely (to a Dirac delta function), you would find a sinusoid in momentum-space.</p>
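The "camel hump" example can be sketched numerically (the frequencies 5 and 12 cycles per unit time are my own illustrative choices):

```python
import numpy as np

# "camel hump" wave sin(ax) + sin(bx) with two underlying frequencies
f1, f2 = 5.0, 12.0  # illustrative frequencies, in cycles per unit time
t = np.linspace(0, 1, 1024, endpoint=False)
wave = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# the Fourier transform recovers the frequency distribution: two equal peaks
power = np.abs(np.fft.rfft(wave)) ** 2
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
top_two = freqs[np.argsort(power)[-2:]]
print(sorted(top_two.tolist()))  # → [5.0, 12.0]
```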
<p>These distributions are called the "wavefunctions" in position and momentum basis respectively, and this duality is the "uncertainty principle" -- read more about this in my <a href="https://thewindingnumber.blogspot.com/p/2103.html" rel="nofollow noreferrer">quantum mechanics articles here</a> (specifically article 4). In the specific case of white light, white light isn't really a well-defined concept in physics -- it has to do with human eyesight and what visible light entails, but nonetheless the frequency of white light is indeed a distribution with non-zero variance.</p>Sun, 28 Jul 2019 18:23:58 GMThttps://physics.stackexchange.com/questions/494081/-/494085#494085Abhimanyu Pallavi Sudhir2019-07-28T18:23:58ZComment by Abhimanyu Pallavi Sudhir on How to prove that orthochronous Lorentz transformations $O^+(1,3)$ form a group?
https://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36425#36425
Wait, how do you do the Cauchy-Schwarz step (Ex. 6) for the general case of the indefinite Orthogonal group?Fri, 26 Jul 2019 12:37:14 GMThttps://physics.stackexchange.com/questions/36384/how-to-prove-that-orthochronous-lorentz-transformations-o1-3-form-a-group/36425?cid=1111428#36425Abhimanyu Pallavi Sudhir2019-07-26T12:37:14ZHow does a covariance intensity function measure clustering?
https://stats.stackexchange.com/questions/418046/how-does-a-covariance-intensity-function-measure-clustering
0<p>I was taught in a class on spatial statistics that the covariance intensity function (defined below) measured clustering and inhibition in a point process, but isn't used because good test statistics for it don't exist.</p>
<p><span class="math-container">$$c(\mathbf{x},\mathbf{y})=\lim_{|d\mathbf{x}|\to0,|d\mathbf{y}|\to 0} \frac{\mathrm{Cov}\{N(d\mathbf{x}), N(d\mathbf{y})\}}{|d\mathbf{x}||d\mathbf{y}|}$$</span></p>
<p>Where <span class="math-container">$\mathbf{x}, \mathbf{y}$</span> are positions on the domain and <span class="math-container">$d\mathbf{x}, d\mathbf{y}$</span> are regions around them with areas given by <span class="math-container">$|d\mathbf{x}|, |d\mathbf{y}|$</span>, and <span class="math-container">$N(R)$</span> represents the random variable corresponding to the number of events in a region.</p>
<p>But I can't see how this measures clustering at all -- if one starts with a process that is described by an intensity function -- any intensity function -- this function should necessarily be zero, as the existence of an intensity function means a point turning up at point <span class="math-container">$\mathbf{x}$</span> is independent of a point turning up at point <span class="math-container">$\mathbf{y}$</span>. And you can certainly have intensity functions that exhibit clustering.</p>
<p>The only way that this function can be non-zero, as I see it, is if there are correlations within a realisation -- e.g. if "everything is clustered to one side" and "everything is clustered to the other side" are the two possibilities -- i.e. if you don't have an intensity function at all, but rather some sort of "entangled state".</p>
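The independence point can be checked by simulation (a sketch of my own: a Poisson process on the unit square with the inhomogeneous intensity $\lambda(x,y)=200x$, built as a Poisson-distributed total with points drawn from the normalised intensity):

```python
import numpy as np

rng = np.random.default_rng(0)

def quadrant_counts(n_sims):
    """Counts in two disjoint quadrants of [0,1]^2 for a Poisson process
    with intensity lambda(x, y) = 200 x (expected total = 100)."""
    counts = np.empty((n_sims, 2))
    for i in range(n_sims):
        n = rng.poisson(100)
        x = np.sqrt(rng.random(n))  # density proportional to x, via inverse CDF
        y = rng.random(n)
        counts[i] = [np.sum((x < 0.5) & (y < 0.5)),
                     np.sum((x >= 0.5) & (y >= 0.5))]
    return counts

c = quadrant_counts(20000)
# counts in disjoint regions of a Poisson process are independent, so the
# sample covariance is ~0 even though the intensity is far from homogeneous
assert abs(np.cov(c[:, 0], c[:, 1])[0, 1]) < 1.0
```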
<p>What am I missing?</p>correlationcovariancespatialpoint-processThu, 18 Jul 2019 10:50:54 GMThttps://stats.stackexchange.com/q/418046Abhimanyu Pallavi Sudhir2019-07-18T10:50:54ZAnswer by Abhimanyu Pallavi Sudhir for How does non-commutativity lead to uncertainty?
https://physics.stackexchange.com/questions/10362/how-does-non-commutativity-lead-to-uncertainty/491378#491378
0<p>When first learning about wavefunction collapse, I was surprised by the idea that the wavefunction would just <em>become</em> an eigenstate of the observable -- losing all other components of the state vector. Well, it's not as bad as you'd first expect, because the Hilbert space is really big. </p>
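A concrete two-dimensional illustration (my own sketch, not part of the original answer), using the Pauli matrices $X$ and $Z$:

```python
import numpy as np

X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])

# X and Z do not commute, so they share no common eigenbasis
assert not np.allclose(X @ Z, Z @ X)

# |+> = (|0> + |1>)/sqrt(2) is an X-eigenstate (X|+> = |+>) ...
plus = np.array([1., 1.]) / np.sqrt(2)
assert np.allclose(X @ plus, plus)

# ... but a Z-measurement finds either Z-eigenstate with probability 1/2:
# a definite X-value means complete uncertainty about Z
probs = np.abs(plus) ** 2  # Born-rule probabilities in the Z basis
assert np.allclose(probs, [0.5, 0.5])
```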
<p>But if two operators <em>do not have a common eigenbasis</em> -- i.e. if they don't commute, you do "lose information" about one observable when measuring the other one. This is precisely what the uncertainty principle codifies.</p>Sat, 13 Jul 2019 10:56:00 GMThttps://physics.stackexchange.com/questions/10362/-/491378#491378Abhimanyu Pallavi Sudhir2019-07-13T10:56:00ZMixed states I: density matrix, partial trace, the most general Born rule
https://thewindingnumber.blogspot.com/2019/07/mixed-states-i-properties-of-density.html
0In the <a href="https://thewindingnumber.blogspot.com/2019/07/systems-and-sub-systems-tensor-product.html">last article</a>, we saw that sub-systems entangled with other sub-systems did <b>not have well-defined pure states</b> themselves -- just like correlated random variables don't have their own probability distributions. Since pretty much everything you see in the real world is entangled with <i>something</i> -- has correlations with some other thing -- this is a problem. One can't just consider the "state of the entire universe" when you just want to study a single electron or <i>something</i>.<br /><br />Wait -- why can't we just consider the <b>marginal distributions</b>, like we do in statistics? OK, suppose we start with the system -- with $|\phi\rangle$ and $|\varphi\rangle$ an orthonormal basis:<br /><br />$$|\psi\rangle = \frac1{\sqrt2}|\phi\rangle\otimes|\varphi\rangle+\frac1{\sqrt2}|\varphi\rangle\otimes|\phi\rangle$$<br />Naively, you may think that the state of the first sub-system $|\psi_1\rangle$ may be given by $|\psi'_1\rangle=\frac1{\sqrt2}|\phi\rangle+\frac1{\sqrt2}|\varphi\rangle$. Certainly, if we're measuring the subsystem with an operator with eigenvalues $|\phi\rangle$ and $|\varphi\rangle$, you have 50% probabilities of each. But to say that two things are in the same state requires that they produce the same outcome for <b>any</b> measurement, not just that one. Does our sub-system behave exactly like $|\psi'_1\rangle$ <b>for all observables</b>? Recall that in the last article, we showed that collapsing the first sub-system onto $|\chi\rangle$ collapses the entire system into the state:<br /><br />$$|\chi\rangle\otimes\left(<br />\langle\chi|\varphi\rangle|\phi\rangle+\langle\chi|\phi\rangle|\varphi\rangle\right)$$<br />To calculate the probability amplitude of this collapse, we may take the inner product of this with the original state -- you can compute this, and see the answer comes down to $1/\sqrt2$, i.e. 
there's a <b>probability of $1/2$ of the first subsystem collapsing to <i>any</i> such eigenstate $|\chi\rangle$</b>. You can use <i>any</i> observable in this two-dimensional state space, and the sub-system would collapse into <b>either eigenstate with probability exactly $1/2$</b>.<br /><br />This is a <i>completely different situation</i> from if the state of the first subsystem were simply a <b>pure state</b> like $|\psi'_1\rangle$.<br /><br />The situation we're dealing with is called a <b>mixed state</b> -- an example of a mixed state, in line with the motivating examples we had at the beginning of the course -- is <b>unpolarised light</b>. In fact, the state we described above models precisely <b>unpolarised light involving two photons</b> (is it obvious why?).<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/f/f7/Circular_Polarizer_Creating_Clockwise_circularly_polarized_light.svg/1086px-Circular_Polarizer_Creating_Clockwise_circularly_polarized_light.svg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="393" data-original-width="800" height="157" src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/f7/Circular_Polarizer_Creating_Clockwise_circularly_polarized_light.svg/1086px-Circular_Polarizer_Creating_Clockwise_circularly_polarized_light.svg.png" width="320" /></a></div><br />The basic idea behind mixed states is that we have some uncertainty as to <i>what the state of a particle is</i> -- we don't know if the particle has state $|\phi\rangle$ or $|\varphi\rangle$ -- it has a 50% chance of either. This is a classical probability, rather than a quantum one, and different from the state being a superposition of these states, as we just saw above.<br /><br /><div class="twn-furtherinsight">Does this sort of thing occur with multivariate distributions in statistics? 
Suppose we have a multivariate distribution $\psi(x,y)$ and extract the marginal $x$-distribution $\phi(x) = \int_y \psi(x,y) dy$. Certainly this $\phi(x)$ gives us the right probability densities of each $x$-value. But the analog of considering general states like $|\chi\rangle$ is to make a <b>transformation of the domain</b> -- like a Fourier transform -- and consider probability densities in the transformed domain.<br /><br />As an exercise, write down a multivariate Fourier transform expression for $\hat{\psi}(\omega_1,\omega_2)$ and use it to compute the $\omega_1$-marginal probabilities $\hat{\phi}(\omega_1)$ -- compare this to what you would get if you were to Fourier-transform the $x$-marginal $\phi(x)$ directly.</div><br /><hr /><br />But saying "it has 1/2 probability of being in $|\phi\rangle$ and 1/2 probability of being in $|\varphi\rangle$" is clearly an <b>overdetermination</b>. As we saw above, this resulting state has a 1/2 probability of collapsing onto any state -- this is a statement that doesn't depend on $|\phi\rangle$ and $|\varphi\rangle$, the behaviour of the state is the same if you describe it instead as having "1/2 probability of being in $\frac1{\sqrt2}(|\phi\rangle+|\varphi\rangle)$ and 1/2 probability of being in $\frac1{\sqrt2}(|\phi\rangle-|\varphi\rangle)$". 
These two are <b>different statistical ensembles</b> but in the <b>same mixed state</b>.<br /><br />You can see similarly that "50% left-polarised + 50% right-polarised" is the same mixed state as "50% left-circular + 50% right-circular" -- they're both just <i>unpolarised light</i>.<br /><br /><b>What's the general condition for two statistical ensembles to produce the same observations?</b><br /><b><br /></b> Given statistical ensembles $\left(\left(p_i,|\psi_i\rangle\right)\right)$ and $\left(\left(p_i,|\psi'_i\rangle\right)\right)$, they are the same mixed state if for all $|\chi\rangle$, the probability of collapsing onto $|\chi\rangle$ is the same, i.e.<br /><br />$$\sum_i p_i|\langle\psi_i|\chi\rangle|^2=\sum_i p_i|\langle\psi'_i|\chi\rangle|^2$$<br />Well, each side of this equation is just the evaluation of a quadratic form for the vector $|\chi\rangle$ -- and <b>two quadratic forms are identically equal on all vectors if and only if their matrix representations are the same</b>. Well, what's the matrix representation? In the basis of the $|\psi_i\rangle$ (when they are orthonormal), it's just the diagonal matrix of the probabilities $p_i$. The way to write this in bra-ket notation is to factor out the $|\chi\rangle$s:<br /><br />$$\left\langle \chi \right|\left( {\sum\limits_i {{p_i}\left| {{\psi _i}} \right\rangle \left\langle {{\psi _i}} \right|} } \right)\left| \chi \right\rangle = \left\langle \chi \right|\left( {\sum\limits_i {{p_i}\left| {{{\psi '}_i}} \right\rangle \left\langle {{{\psi '}_i}} \right|} } \right)\left| \chi \right\rangle$$<br />This quadratic form in between, representing a mixed state, is called the <b>density matrix</b> and can be used to completely specify mixed states. 
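As a concrete check (a numpy sketch, not part of the original post; the qubit basis vectors chosen are an arbitrary illustration), the two ensembles above -- 50/50 in two orthonormal states versus 50/50 in their normalised sum and difference -- have identical density matrices:

```python
import numpy as np

# Ensemble A: 50% |phi>, 50% |varphi> (e.g. "50% left-polarised + 50% right-polarised")
phi = np.array([1, 0], dtype=complex)
var = np.array([0, 1], dtype=complex)
rho_A = 0.5 * np.outer(phi, phi.conj()) + 0.5 * np.outer(var, var.conj())

# Ensemble B: 50% (|phi>+|varphi>)/sqrt(2), 50% (|phi>-|varphi>)/sqrt(2)
plus = (phi + var) / np.sqrt(2)
minus = (phi - var) / np.sqrt(2)
rho_B = 0.5 * np.outer(plus, plus.conj()) + 0.5 * np.outer(minus, minus.conj())

# Both are I/2 -- the same mixed state (unpolarised light)
same_mixed_state = np.allclose(rho_A, rho_B)
```

Since every observation probability is computable from the density matrix, equal density matrices mean no measurement whatsoever can distinguish the two ensembles.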
In this sense, it is a <b>generalisation of the state vector</b>, which can only be used to represent pure states.<br /><br />$$\rho={\sum\limits_i {{p_i}\left| {{\psi _i}} \right\rangle \left\langle {{\psi _i}} \right|} }$$<br />You may confirm that indeed:<br /><br />$$\frac12|\phi\rangle\langle\phi|+\frac12|\varphi\rangle\langle\varphi|=\frac12\left(\frac{|\phi\rangle+|\varphi\rangle}{\sqrt2}\frac{\langle\phi|+\langle\varphi|}{\sqrt2}\right)+\frac12\left(\frac{|\phi\rangle-|\varphi\rangle}{\sqrt2}\frac{\langle\phi|-\langle\varphi|}{\sqrt2}\right)$$<br />In fact, there is a simpler way to see that those two ensembles are the same: the density matrix is simply the <b>Gram matrix of the ensemble</b> -- you stack the states of the ensemble, weighted by $\sqrt{p_i}$, as the rows of a matrix $Y$, and $\rho=Y^*Y$. Well, $Y^*Y=Y'^*Y' \iff Y'=UY$ for some unitary $U$, i.e. the ensembles are unitary rotations of each other.<br /><br /><hr /><br /><b>Properties of the density matrix, generalised Born's rule, etc.</b><br /><br />Here's something that's obvious: the density matrix is <b>nonnegative-definite ("positive-semidefinite") Hermitian and unit-trace</b> -- and all such matrices can represent density matrices.<br /><br />Well, so it's a Hermitian operator -- does it represent any interesting observable? Not really. It's an observable, sure, but not an interesting one (you might say it measures something's being in one of the ensemble states -- written in an orthonormal basis -- and whose eigenvalues are the mixing ratios, etc. -- but what if two mixing ratios are the same? Its behaviour is just bizarre and useless, really).<br /><br />We saw earlier that the probability of a density matrix collapsing into a state $|\chi\rangle$ is given by $\langle\chi|\rho|\chi\rangle$.<br /><br /><div class="twn-pitfall">This is <b>completely different</b> from the generalised Born's rule we saw earlier which took the form $\langle\psi|L|\psi\rangle$! 
There, the <em>state</em> was the vector and the information on the projection space was the quadratic form in between. Here, the state is the quadratic form in between while the state being projected onto is the vector. This is just a generalisation of the simple Born rule $\langle\chi|\psi\rangle\langle\psi|\chi\rangle$, as far as I can see. If anyone comes up with a connection between it and the generalised Born rule for pure states, tell me.</div><br />This brings the question, though -- what's the <i>most generalised Born's rule</i> we can come up with? What is the probability of a <b>mixed state collapsing into some eigenspace of a Hermitian projection operator</b>?<br /><br />Well, given the ensemble $((p_i,|\psi_i\rangle))$ (you can start writing ensembles with their density matrices now if you like, like $\sum p_i|\psi_i\rangle\langle\psi_i|$ -- but I just want to reaffirm that our result will indeed be in terms of the density matrix), the probability is:<br /><br />$$\sum_i p_i\langle\psi_i|L|\psi_i\rangle$$<br />This is hardly useful -- it's not in terms of the density matrix at all. But look at each term -- what's $p_i\langle\psi_i|L|\psi_i\rangle$? $L|\psi_i\rangle$ is the $i$th column of $L$ in the $(|\psi_i\rangle)$-basis -- the inner product $\langle\psi_i|L|\psi_i\rangle$ is the $i$th entry of this column. Multiplying this by $p_i$ gives us the <i>dot product of the $i$th row of $\rho$ with the $i$th column of $L$</i>. The <i>sum</i> of these for all $i$ gives us the trace of $\rho L$:<br /><br />$$\ldots = \mathrm{tr}(L\rho)$$<br />This is the <b>most general form of Born's rule</b>. 
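Here's a quick numerical sanity check of $\mathrm{tr}(L\rho)$ against the ensemble sum $\sum_i p_i\langle\psi_i|L|\psi_i\rangle$ (a numpy sketch; the random ensemble and projection subspace are arbitrary illustrations, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random 3-member ensemble in a 4-dimensional Hilbert space
dim, n = 4, 3
p = rng.random(n)
p /= p.sum()                                          # mixing probabilities
psis = rng.normal(size=(n, dim)) + 1j * rng.normal(size=(n, dim))
psis /= np.linalg.norm(psis, axis=1, keepdims=True)   # normalised states

rho = sum(pi * np.outer(v, v.conj()) for pi, v in zip(p, psis))

# A Hermitian projection L onto a random 2-dimensional subspace
q, _ = np.linalg.qr(rng.normal(size=(dim, 2)) + 1j * rng.normal(size=(dim, 2)))
L = q @ q.conj().T

# Ensemble-sum probability vs the trace form of Born's rule
ensemble_prob = sum(pi * (v.conj() @ L @ v).real for pi, v in zip(p, psis))
trace_prob = np.trace(L @ rho).real
```

The two agree because $\mathrm{tr}(L\rho)=\sum_i p_i\,\mathrm{tr}(L|\psi_i\rangle\langle\psi_i|)=\sum_i p_i\langle\psi_i|L|\psi_i\rangle$ by linearity and cyclicity of the trace.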
Note that our derivation could have also applied to finding the <b>expectation value</b> of a general operator $A$ under the density matrix $\rho$ (recall that Hermitian projection operators are basically "indicator variables" whose expectation values represent probabilities), indeed generally:<br /><br />$$\langle A\rangle_\rho=\mathrm{tr}(A\rho)$$<br />(note that $\mathrm{tr}(V)=\sum_{i} \langle i | V|i \rangle $ for any orthonormal basis $(|i\rangle)$, which you should show.)<br /><br />It is also trivial to show that upon collapse given by Hermitian projection operator $L$, the density matrix <b>collapses</b> to:<br /><br />$$\rho'=\frac1{\mathrm{tr}\left(L\rho L\right)}{L\rho L}=\frac1{\mathrm{tr}\left(L\rho\right)}{L\rho L}$$<br />This generalises the pure-state collapse $|\psi'\rangle=\frac{1}{\sqrt{\langle \psi | L | \psi\rangle}}L|\psi\rangle$. One may check that the above expression reduces to $|\chi\rangle\langle\chi|$ in the case where $L=|\chi\rangle\langle \chi|$.<br /><br /><hr /><br /><b>Partial trace, trace</b><br /><b><br /></b> We started our discussion considering the pure state $\frac1{\sqrt2}|\phi\rangle\otimes|\varphi\rangle+\frac1{\sqrt2}|\varphi\rangle\otimes|\phi\rangle$ and asking for the mixed state of the first sub-system. We computed the inner product of this state with its projection under the operator $|\chi\rangle\langle\chi|\otimes1$ -- this tells us the evaluation of the quadratic form $\langle\chi|\rho_1|\chi\rangle$ at all vectors $|\chi\rangle$, which determines the density matrix $\rho_1$ of the first sub-system.<br /><br />So what exactly did we do -- in general? Starting with a density matrix $\rho$ on $H_1\otimes H_2$, we compute the probability of the first sub-system appearing in state $|\chi\rangle$: it's $\mathrm{tr}((|\chi\rangle\langle\chi|\otimes 1)\rho)$. 
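The whole computation can be sketched numerically (a numpy illustration, not from the original post; qubits are an arbitrary choice): build $\rho$ for the entangled state above, trace out the second factor, and check that any $|\chi\rangle$ is observed with probability $1/2$:

```python
import numpy as np

# |psi> = (|phi> x |varphi> + |varphi> x |phi>) / sqrt(2), with phi, varphi orthonormal
phi = np.array([1, 0], dtype=complex)
var = np.array([0, 1], dtype=complex)
psi = (np.kron(phi, var) + np.kron(var, phi)) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

# Partial trace over the second factor: tr_2(rho)[a,b] = sum_j rho[(a,j),(b,j)]
rho1 = np.trace(rho.reshape(2, 2, 2, 2), axis1=1, axis2=3)

# rho1 = I/2, so ANY normalised |chi> is observed with probability 1/2
theta = 0.3
chi = np.array([np.cos(theta), np.sin(theta)], dtype=complex)
proj_full = np.kron(np.outer(chi, chi.conj()), np.eye(2))
p_via_full = np.trace(proj_full @ rho).real          # tr((|chi><chi| x 1) rho)
p_via_reduced = (chi.conj() @ rho1 @ chi).real       # <chi| rho1 |chi>
# both equal 0.5
```

The reduced matrix being $I/2$ is exactly the "collapses onto any state with probability $1/2$" behaviour described at the start of the post.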
So we try to find a density matrix $\rho_1$ satisfying, for all states $|\chi\rangle$:<br /><br />$$\mathrm{tr}((|\chi\rangle\langle\chi|\otimes 1)\rho)=\mathrm{tr}(|\chi\rangle\langle\chi|\rho_1)$$<br /><br /><b>Exercise: </b>Let $V$ be an operator on $H_1\otimes H_2$. We define its <b>partial trace</b> on $H_2$ as $\mathrm{tr}_2(V)=\sum_{j}(1\otimes\langle j|)V(1\otimes|j\rangle)$ for an orthonormal basis $(|j\rangle)$ of $H_2$ (the bras and kets act on the second factor only, extended to the full space by tensoring with the identity). Show that the density matrix $\rho_1$ is given by:<br /><br />$$\rho_1=\mathrm{tr}_2(\rho)$$<br />I.e. show that for operators of the form $A\otimes I$: $\mathrm{tr}[(A\otimes I)\rho]=\mathrm{tr}(A\,\mathrm{tr}_2(\rho))$.born's rule, density matrix, gram matrix, linear algebra, mixed states, partial trace, quantum mechanics, trace, unpolarised light. Mon, 08 Jul 2019 14:26:00 GMT. Abhimanyu Pallavi Sudhir.
Systems and sub-systems, the tensor product, quantum entanglement
https://thewindingnumber.blogspot.com/2019/07/systems-and-sub-systems-tensor-product.html
What if we want to describe probabilities relating to multiple objects -- a system of objects?<br /><br />You might think it's sufficient to just write down the state vector of each individual object -- but this doesn't really tell us the entire picture. Suppose for example we're considering the state of Schrodinger's cat, less popularly known as <b>Schrodinger's cuckoo</b>, where the box contains TNT and the cuckoo bird. The state of the TNT is $|\mathrm{unexploded}\rangle+|\mathrm{exploded}\rangle$ and the state of the cuckoo is $|\mathrm{alive}\rangle+|\mathrm{dead}\rangle$ -- right?<br /><br /><div class="twn-pitfall">Never trust a cuckoo.</div><br /><div class="twn-furtherinsight">If I'm in charge of the box, the state of the cuckoo will definitely be $|\mathrm{dead}\rangle$ even if the state of the TNT is $|\mathrm{unexploded}\rangle$</div><br />Not exactly. There is a <b>correlation</b> between the state of the TNT and the state of the cuckoo that goes missing here -- the cuckoo is dead if and only if the TNT has exploded, and the cuckoo is alive if and only if the TNT is unexploded. In fact, defining the state vectors separately doesn't really make any sense -- we're just assigning the coefficients based on overall probabilities, and as we will see, this is really mixing up quantum and classical probabilities in a certain way whereas the state vector is supposed to only show quantum probabilities.<br /><br />Well, this is really the same question as what we do when we have multiple correlated random variables in statistics -- we define a <b>"joint" probability function</b> on a <b>"joint" phase space</b> that is the <b>Cartesian product of the original phase spaces</b>.<br /><br />You may be inclined to claim that similarly, our new Hilbert space should be the Cartesian product of the original Hilbert spaces. 
But Hilbert spaces are different in a fundamental sense from these phase spaces -- every point on the classical phase space is an independent basis state in the Hilbert space -- and general <b>vectors are <i>distributions</i></b> on the classical phase space. So like how the <em>cardinalities</em> of the classical phase spaces are multiplied, the <b><em>dimensions</em> of the Hilbert spaces are multiplied</b>.<br /><br /><div class="twn-furtherinsight">The reason that we sometimes draw an analogy between the classical state space and the quantum state space is that the state vectors are really the "real objects" in quantum mechanics, and the Hilbert space shows the possible configurations of the state in this sense.</div><br />The product we want of Hilbert spaces -- which is <i>not</i> the Cartesian product -- is called a <b>tensor product of linear spaces</b> -- given an orthogonal basis $(|\phi_1\rangle,|\phi_2\rangle,\ldots)$ for the first Hilbert space and $(|\psi_1\rangle,|\psi_2\rangle,\ldots)$ for the second, their tensor product is spanned by new vectors which we denote as<br /><br />$$\left( {\begin{array}{*{20}{c}}{|{\phi_1}\rangle \boxtimes |{\psi_1}\rangle ,|{\phi_1}\rangle \boxtimes |{\psi_2}\rangle ,...,}\\{|{\phi_2}\rangle \boxtimes |{\psi_1}\rangle ,|{\phi_2}\rangle \boxtimes |{\psi_2}\rangle ,...,}\\ \vdots \end{array}} \right)$$<br /><div class="twn-pitfall">We're using $\boxtimes$ instead of $\otimes$ in the above enumeration of the basis, because we haven't yet defined the tensor product of states. The idea is that $|\phi_i\rangle\boxtimes|\psi_j\rangle$ are just placeholders, and we will shortly state that they are/can be the tensor product $|\phi_i\rangle\otimes|\psi_j\rangle$, which we will define now.</div><br />Certainly, this can represent any possible state that the combined system of two objects can be in. What we need is a way to express the state of a combined system of two independent things in this "larger" Hilbert space -- i.e. 
a <b>map</b> from $H_1\times H_2\to H_1\otimes H_2$ that takes the (pure) states of two independent objects in $H_1$ and $H_2$ and outputs their state as a combined system in $H_1\otimes H_2$ -- we will call this product the <b>tensor product of vectors</b>, and denote it by the same symbol $\otimes$.<br /><br />OK, so what's the map? Certainly, $|\phi_i\rangle\otimes|\psi_j\rangle$ must form an <b>orthogonal basis</b> for $H_1\otimes H_2$ (why? think about this for a while -- they're clearly <b>orthogonal, as they are mutually exclusive</b> -- you can't be in "$|\phi_i\rangle$ and $|\psi_j\rangle$" and "$|\phi_{i'}\rangle$ and $|\psi_{j'}\rangle$" unless $(i,j)=(i',j')$; <b>spanning is proven similarly</b>, as considering the $|\phi_i\rangle$s and $|\psi_j\rangle$s as eigenstates of some operators $X$ and $Y$ on $H_1$ and $H_2$, then if one performs the operation of "observing $X$ and $Y$" -- and we can do this because the objects are independent -- then because the objects must be found in one of $|\phi_i\rangle$ and one of $|\psi_j\rangle$, the system must be found in one of $|\phi_i\rangle\otimes|\psi_j\rangle$ -- thus its original state was a linear combination of such states).<br /><br />OK, so<br /><br />$$<br />(p_1|\phi_1\rangle+p_2|\phi_2\rangle+\ldots)\otimes(q_1|\psi_1\rangle+q_2|\psi_2\rangle+\ldots)\\<br />\begin{align}<br />=\ & r_{11}|\phi_1\rangle\otimes|\psi_1\rangle + r_{12}|\phi_1\rangle\otimes|\psi_2\rangle+\ldots+\\<br />&r_{21}|\phi_2\rangle\otimes|\psi_1\rangle + r_{22}|\phi_2\rangle\otimes|\psi_2\rangle+\ldots+\\<br />& \vdots<br />\end{align}$$<br />What are the coefficients $r_{ij}$?<br /><br />Well, it's fairly obvious that $|r_{ij}|^2=|p_{i}|^2|q_{j}|^2$ -- that the <b><i>probabilities</i> are multiplicative</b>, this is tautological given what we want our product to represent -- the probability that the system is found in the state $|\phi_i\rangle\otimes|\psi_j\rangle$ is the probability that the objects are found in states 
$|\phi_i\rangle$ and $|\psi_j\rangle$, which is the product of the respective probabilities, as they are independent objects.<br /><br />Is it also true that the <b>probability amplitudes are multiplicative</b>, i.e. $r_{ij}=p_iq_j$?<br /><br />This may seem hard to prove, but the idea is quite simple: suppose we observe the state with the observables $X$ and $Y$, and find it in the state $|\phi_i\rangle\otimes|\psi_j\rangle$. Well, then if $r_{ij}=u_{ij}|r_{ij}|$ for some unit complex number $u_{ij}$, then from the right-hand-side, we must have collapsed to $u_{ij}|\phi_i\rangle\otimes|\psi_j\rangle$. So we must have $u_{ij}=1$.<br /><br />So indeed the product we're looking for is exactly the <b>tensor product from tensor algebra</b>.<br /><br /><div class="twn-furtherinsight">Here's a thing worth noting -- we've been referring to "systems" and "objects" as if they are somehow completely distinct things. But are they? The cat's state is itself a <b>tensor product</b> of a massive number of different states belonging to each <b>elementary particle</b> in its body, and lives already in a massive Hilbert space, because the "object" is itself a system. We will use the term <b>subsystem</b> instead of <b>object</b> from now.</div><br /><hr /><br />Alright: so we now know that elements of the tensored Hilbert space are all states, and only the ones that are <b>factorable</b> into an element of $H_1$ and $H_2$ represent subsystems that are independent. This is precisely how only <b>factorable probability mass/density functions</b> represent independent variables in statistics. Otherwise the variables are correlated -- not necessarily linearly correlated, but correlated.<br /><br />Such correlations can, of course, exist in our quantum mechanical theory, too -- like the cuckoo-TNT system we mentioned earlier. These are called <b>quantum correlations</b> or <b>quantum entanglement</b>.<br /><br />Why the fancy name? 
Because its consequences may seem superficially kinda "surprising". It's also a demonstration of quantum mechanics being different from classical mechanics, because without entangled states, the dimension of $H_1\otimes H_2$ would indeed be the sum of those of $H_1$ and $H_2$ rather than their product, like with phase spaces in classical mechanics.<br /><br />OK, what kind of surprising consequences?<br /><br />They're basically all of the following nature: suppose we have a state given by:<br /><br />$$\frac1{\sqrt2} (|\phi\rangle\otimes|\psi\rangle+|\psi\rangle\otimes|\phi\rangle)$$<br />I.e. two entangled particles where we know that they are in <b>two distinct states, but we don't know which is which</b>. Such a state can certainly be produced -- how? Just put two identical independent particles in a box then do a <b>"partial" measurement</b> -- a "<b>peek</b>" -- (which can be achieved, e.g. by some logic gates) that checks if they're in the same state or not, and uncovers no other information.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-RdQ446MRCZ0/XR-4FMLPe8I/AAAAAAAAFnI/Wdq0C5sq0csOQ7oa3N_WDTclzTPPFqMNACLcBGAs/s1600/alibobeshwar-1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="472" data-original-width="1144" height="132" src="https://1.bp.blogspot.com/-RdQ446MRCZ0/XR-4FMLPe8I/AAAAAAAAFnI/Wdq0C5sq0csOQ7oa3N_WDTclzTPPFqMNACLcBGAs/s320/alibobeshwar-1.png" width="320" /></a></div><br />Now separate the particles spatially -- there's nothing wrong with this, they're still a system, which still has a state -- and give one to Alice and the other to Bob. 
Now if Bob looks at his particle and sees it in $|\psi\rangle$, he immediately <i><b>knows</b></i> that Alice could only observe her particle to be in state $|\phi\rangle$ -- <b>there's nothing Alice can do to change this outcome</b>.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-IgUTjPAGjsk/XR-4JqV0FJI/AAAAAAAAFnM/K5TDzVtWn_Qco4PtrHfMEkryqkq5iVJMQCLcBGAs/s1600/alibobeshwar-2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="540" data-original-width="1144" height="151" src="https://1.bp.blogspot.com/-IgUTjPAGjsk/XR-4JqV0FJI/AAAAAAAAFnM/K5TDzVtWn_Qco4PtrHfMEkryqkq5iVJMQCLcBGAs/s320/alibobeshwar-2.png" width="320" /></a></div><br />(you may worry that spatially separating the particles alters the state in some important way -- but it doesn't: the states $|\phi\rangle$ and $|\psi\rangle$ are each transformed individually, in a way that doesn't change the entangled structure of the combined state -- make sure this makes sense to you. But if it makes you happy, you could imagine the particles were already spatially separated when they were first entangled.)<br /><br />OK, perhaps you don't find this particularly surprising or unintuitive -- I don't either. But perhaps you do -- perhaps you think there's a violation of locality -- and the reason you do is because you haven't yet fully accepted <a href="https://thewindingnumber.blogspot.com/2017/08/three-domains-of-knowledge.html">logical positivism</a>. Let's consider what locality entails for each observer in the set-up, and see if it's violated:<br /><ul><li><b>Alice:</b> From Alice's perspective, Bob opening his box is just another way to observe her particle -- or rather, she can <b>observe Bob's brain</b> that contains the information, which collapses the state from her perspective. But this is <b>perfectly local</b> -- it takes time for information to propagate from Bob to her. 
Alternatively, if she doesn't observe Bob's brain and <b>just observes her own box</b> later, that's when her state collapses to $|\phi\rangle$, and she then learns that Bob had collapsed his state into $|\psi\rangle$ -- but as Bob <b>cannot choose what his state vector collapses</b> to, he can't send her any information through entanglement. Even if there were a large number of entangled systems this way, the distribution of the states Alice can observe is the same whether or not Bob has collapsed his states (you can confirm this -- this is an idea called the <b>no-communication theorem</b> which we will discuss later in more mathematical detail).</li><li><b>Bob: </b>Certainly, Bob acquires knowledge of something far away, but no information actually propagated from Alice to him -- he just observed his own box.</li><li><b>another observer: </b>Charlie, who stands somewhere between Alice and Bob, also takes time to observe Bob's brain.</li></ul><div>So there really isn't a violation of locality. This isn't surprising at all -- certainly one could have <b>classical correlations</b> too. You could just juggle two distinct particles in a box and give them to each person, and Bob discovering his particle allows him to determine Alice's particle. </div><div><br /></div><div>The difference between the classical case and the quantum case is that in the classical case you could <i>pretend</i> that there's some <b>hidden truth</b> that is just not known to the observers. Quantum mechanics forbids any such hidden truth (as confirmed by commutator relations), and forces you to accept logical positivism, and there cannot be a "universal observer" as such a notion is inherently non-local. 
But the fact that correlation isn't non-local doesn't depend on whether you have <b>metaphysical notions of hidden truths</b> in classical physics -- it is a physical question, and is the <b>same in the classical and quantum cases</b>.<br /><br /><hr /><br />Are we done writing down our algebra of tensor products? We still haven't discussed how <b>inner products</b> and <b>projections </b>of tensor products behave. The basic question is "how do we <b>upgrade/combine operators</b> from $H_1$ and $H_2$ to $H_1\otimes H_2$?" Let's start with the simple case of a factorable state in the form $|\phi\rangle\otimes|\varphi\rangle$. Suppose we apply a projection operator $X$ on the first particle. Have we made any observation on the second state? No -- just an identity projection. Or we could make an observation, a projection $Y$. So we can say that for the combined observation $X\otimes Y$,<br /><br />$$(X\otimes Y)(|\phi\rangle\otimes|\varphi\rangle)=(X|\phi\rangle)\otimes(Y|\varphi\rangle)$$<br />And an upgrade from $H_1$ to $H_1\otimes H_2$ is just tensoring with the identity $X\otimes 1$.<br /><br />But the full range of operators on $H_1\otimes H_2$ is a lot more complicated. We could consider <b>entangled states</b>. We could consider operators that are entangled (<b>"partial measurement" operators</b> like we described -- think about what these are). What would measurements on linear combinations of states look like (we know they <i>should</i> apply linearly, but let's show that)?<br /><br />Suppose we have a state in the form $\frac1{\sqrt2}|\phi\rangle\otimes|\varphi\rangle+\frac1{\sqrt2}|\varphi\rangle\otimes|\phi\rangle$. What exactly is this? We had two independent subsystems each in state $\frac1{\sqrt2}|\phi\rangle+\frac1{\sqrt2}|\varphi\rangle$, then we made an observation that showed they were in two distinct states -- we don't know which is which. 
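The factorwise rule $(X\otimes Y)(|\phi\rangle\otimes|\varphi\rangle)=(X|\phi\rangle)\otimes(Y|\varphi\rangle)$ above is just the mixed-product property of the Kronecker product; here's a quick numpy check (random matrices and vectors as arbitrary illustrations, not from the post):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random operators and states on two 2-dimensional spaces H1, H2
X = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
Y = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
phi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)

# (X x Y)(|phi> x |psi>) = (X|phi>) x (Y|psi>)
lhs = np.kron(X, Y) @ np.kron(phi, psi)
rhs = np.kron(X @ phi, Y @ psi)

# "Upgrading" X from H1 to H1 x H2 is tensoring with the identity
upgraded = np.kron(X, np.eye(2))
```

The upgraded operator acts on the first factor and leaves the second alone, exactly as the text describes.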
Now we make an observation and collapse the first subsystem to $|\chi\rangle$.<br /><br /><b>How does this alter the state of sub-system 2?</b><br /><br />OK, so "was" (quotation marks! quotation marks!) the system in $|\phi\rangle\otimes|\varphi\rangle$ or $|\varphi\rangle\otimes|\phi\rangle$? This is a question for <b>Bayes' theorem</b>.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-pf_gBpB1fu4/XSGtBgjdpoI/AAAAAAAAFns/zgumWWJQ2k8MIPnK3NiRd72_5fCIZDSGwCLcBGAs/s1600/bayes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="396" data-original-width="560" height="226" src="https://1.bp.blogspot.com/-pf_gBpB1fu4/XSGtBgjdpoI/AAAAAAAAFns/zgumWWJQ2k8MIPnK3NiRd72_5fCIZDSGwCLcBGAs/s320/bayes.png" width="320" /></a></div><br /><div class="twn-pitfall">The above diagram is for <em>illustration only</em>. There is no real hidden truth (as we will see a few articles from now) of whether the state was initially $|\phi\rangle\otimes|\varphi\rangle$ or otherwise. But the probabilities still obey all the standard laws, such as Bayes' theorem, so tree diagrams make sense to illustrate this.</div><br />So the "probability that the system was in $|\phi\rangle\otimes|\varphi\rangle$" (quotation marks! quotation marks!) is:<br /><br />$$\frac{\frac12|\langle\chi|\phi\rangle|^2}{\frac12|\langle\chi|\phi\rangle|^2 + \frac12|\langle\chi|\varphi\rangle|^2}$$<br />(which if $\langle\phi|\varphi\rangle=0$ is just $|\langle\chi|\phi\rangle|^2$), and analogously for the other possibility. So the collapse of sub-system 1 to $|\chi\rangle$ collapses the entire state to<br /><br />$$|\chi\rangle\otimes\left(\frac1{\sqrt2}\langle\chi|\phi\rangle\cdot|\varphi\rangle+\frac1{\sqrt2}\langle\chi|\varphi\rangle\cdot|\phi\rangle\right)$$<br />Or some normalisation thereof if $\langle\phi|\varphi\rangle\ne0$. 
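This collapse can be verified numerically (a numpy sketch, not from the original post, with orthonormal $|\phi\rangle,|\varphi\rangle$ and an arbitrary real $|\chi\rangle$): applying $|\chi\rangle\langle\chi|\otimes 1$ to the entangled state reproduces the formula above, up to normalisation:

```python
import numpy as np

phi = np.array([1, 0], dtype=complex)
var = np.array([0, 1], dtype=complex)
psi = (np.kron(phi, var) + np.kron(var, phi)) / np.sqrt(2)

# Collapse sub-system 1 onto |chi> by applying |chi><chi| x 1 and renormalising
theta = 0.7
chi = np.array([np.cos(theta), np.sin(theta)], dtype=complex)
P = np.kron(np.outer(chi, chi.conj()), np.eye(2))
collapsed = P @ psi
collapsed /= np.linalg.norm(collapsed)

# The formula from the text: |chi> x (<chi|phi>|varphi> + <chi|varphi>|phi>)/sqrt(2)
second = ((chi.conj() @ phi) * var + (chi.conj() @ var) * phi) / np.sqrt(2)
expected = np.kron(chi, second)
expected /= np.linalg.norm(expected)
```

Both routes give the same normalised post-collapse state vector.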
You can confirm that if $|\chi\rangle=|\phi\rangle$ or $|\chi\rangle=|\varphi\rangle$, this reduces to $|\phi\rangle\otimes|\varphi\rangle$ or $|\varphi\rangle \otimes |\phi\rangle$ respectively as we expect.<br /><br />You can check that this is <b>precisely what you get from applying the projection operator</b> $|\chi\rangle\langle\chi|\otimes 1$ as a linear operator to the original state. The above argument can be repeated for a general vector in the tensored space, yielding the linearity of the tensored operator.<br /><br /><div class="twn-exercises">What about <b>inner products of tensored states</b>? Convince yourself that the inner product of $|\phi_1\rangle\otimes|\phi_2\rangle$ and $|\chi_1\rangle \otimes |\chi_2\rangle$ is $\langle\phi_1|\chi_1\rangle\langle\phi_2|\chi_2\rangle$.</div><br />There was the other case we mentioned -- we may have operators that are themselves entangled. What does this mean? Suppose we start with the factorable state:<br /><br />$$\frac12|\phi\rangle\otimes|\phi\rangle+\frac12|\phi\rangle\otimes|\varphi\rangle+\frac12|\varphi\rangle\otimes|\phi\rangle+\frac12|\varphi\rangle\otimes|\varphi\rangle$$<br />Then perform the observation corresponding to "are the states different from each other?" This is a projection onto the plane spanned by $(|\phi\rangle\otimes|\varphi\rangle,|\varphi\rangle\otimes|\phi\rangle)$, perpendicular to the plane spanned by $(|\phi\rangle\otimes|\phi\rangle,|\varphi\rangle\otimes|\varphi\rangle)$ (confirm that this is true based on our discussion above of applying inner products to tensored states) -- we can write this as:<br /><br />$$\left(|\phi\rangle\otimes|\varphi\rangle\right)\left(\langle\phi|\otimes\langle\varphi|\right) +<br />\left(|\varphi\rangle\otimes|\phi\rangle\right)\left(\langle\varphi|\otimes\langle\phi|\right)<br />$$<br />(check that this, and the system of "distributing bras" makes sense). 
Or alternatively:<br /><br />$$|\phi\rangle\langle\phi|\otimes|\varphi\rangle\langle\varphi|+|\varphi\rangle\langle\varphi|\otimes|\phi\rangle\langle\phi|$$<br />One can check that applying this operator to the factorable state indeed results in the entangled state (up to normalisation). For better insight, perform the operation on the factored form of the state.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-_grXpKdhg2s/XSI6bApIi8I/AAAAAAAAFoE/ohtG-MZb51I1g34o5G5fiXQlZ5ch3BQ6wCLcBGAs/s1600/factorable%2Bstates%2Band%2Boperators.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1400" data-original-width="808" height="400" src="https://1.bp.blogspot.com/-_grXpKdhg2s/XSI6bApIi8I/AAAAAAAAFoE/ohtG-MZb51I1g34o5G5fiXQlZ5ch3BQ6wCLcBGAs/s400/factorable%2Bstates%2Band%2Boperators.png" width="230" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: left;">In the representation shown in the above diagram, a <b>factorable state</b> is one given by a <b>single square region</b>, while a <b>factorable projection operator</b> is one that selects a square region out of the maximally distributed state. With this notion, it becomes clear that <b>an entangled operator (one that cannot be factored into operators on each Hilbert space) is the only kind of operator that projects a factorable state into an entangled state</b>. </div><br /><div class="twn-pitfall">We're not saying that entangled operators <em>always</em> project factorable states onto entangled ones -- you can just measure an irrelevant property of the system (e.g. you know each particle is either in the UK or France, and you check "are they both in India?"). 
But they are the only operators that <em>can</em>, and for any such operator, there exist factorable states that it entangles (almost by definition).</div><br /><div class="twn-pitfall">Note that we're only talking about <em>projection operators</em> above. We could certainly have factorable observables that enforce a partial measurement -- e.g. $X_1\otimes X_2$, which measures the product of the positions of the two particles -- but the projection operators onto each of the eigenstates of this operator are not factorable (check this).</div><br /><div class="separator" style="clear: both; text-align: left;">The following rules then determine the action of our operations on the tensored Hilbert space.</div><div class="separator" style="clear: both; text-align: left;"></div><ol><li>$(A\otimes B)(|\phi\rangle\otimes|\varphi\rangle)=(A|\phi\rangle)\otimes(B|\varphi\rangle)$ -- the tensored operators <b>associate</b> with the corresponding states.</li><li>The tensor product of two linear operators is <b>linear</b>.</li><li>The image of a <b>linear</b> combination of operators is the linear combination of the images, i.e. $(A+B)|\psi\rangle=A|\psi\rangle+B|\psi\rangle$.</li></ol></div>alice and bobcorrelationentanglementmeasurementpartial measurementquantum mechanicsstatisticstensor productSat, 06 Jul 2019 07:52:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7514288790561704826Abhimanyu Pallavi Sudhir2019-07-06T07:52:00ZAnswer by Abhimanyu Pallavi Sudhir for Should it be obvious that independent quantum states are composed by taking the tensor product?
https://physics.stackexchange.com/questions/54896/should-it-be-obvious-that-independent-quantum-states-are-composed-by-taking-the/489138#489138
0<p>I think it is pretty obvious. Correct me if my argument is wrong somewhere.</p>
<p>In the classical case, if you want to describe e.g. the x-positions of two particles, you have a two-dimensional phase space to show the possible states -- and two is the sum of one and one. But a quantum state space is very different -- every point in the <span class="math-container">$x_1$</span> "axis" is a basis vector of its own, and likewise for <span class="math-container">$x_2$</span> -- the state vectors we speak of are vectors in the Hilbert space, and can be shown as distributions mapped on this <span class="math-container">$(x_1,x_2)$</span> plane, representing them as superpositions of these basis vectors.</p>
<p>So it makes perfect sense that the dimension of the product space is the product of the dimensions and not the sum. The total number of points in the <span class="math-container">$(x_1,x_2)$</span> plane -- which is the dimension of this new Hilbert space -- is the product of the number of points on the <span class="math-container">$x_1$</span> axis and the <span class="math-container">$x_2$</span> axis.</p>
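<p>In a finite-dimensional sketch (assuming numpy; the amplitudes below are arbitrary examples), both claims -- that dimensions multiply, and that component magnitudes multiply -- can be checked directly:</p>

```python
import numpy as np

# Hypothetical normalised amplitude vectors for two independent systems,
# of dimensions 3 and 4 (arbitrary example numbers).
p = np.array([0.6, 0.8j, 0.0])
q = np.array([0.5, 0.5, 0.5, 0.5])

combined = np.kron(p, q)  # the tensor-product state

# The dimension of the combined space is the product 3 * 4, not the sum:
assert combined.shape == (12,)

# The combined probabilities are the products of the individual ones:
assert np.allclose(np.abs(combined).reshape(3, 4) ** 2,
                   np.outer(np.abs(p) ** 2, np.abs(q) ** 2))
```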
<p>It's clear that the <em>probabilities</em> are multiplicative. Given states <span class="math-container">$|\phi\rangle=\sum p(x)|x\rangle$</span> and <span class="math-container">$|\psi\rangle=\sum q(y)|y\rangle$</span> in bases <span class="math-container">$|x\rangle$</span> and <span class="math-container">$|y\rangle$</span>, it is clear that the <em>magnitudes</em> of the components of the state <span class="math-container">$$|\phi\rangle\otimes|\psi\rangle=\sum r(x,y)|x\rangle\otimes|y\rangle$$</span> where <span class="math-container">$\otimes$</span> (is the desired product representing composition) of the combined system are <span class="math-container">$|r(x,y)|^2=|p(x)q(y)|^2$</span>. But -- as you ask in your question -- how do we know that <span class="math-container">$r(x,y)=p(x)q(y)$</span>?</p>
<p>The idea is quite simple, though -- suppose we have a state like </p>
<p><span class="math-container">$$\left( {\frac{1}{{\sqrt 2 }}\left| x \right\rangle + \frac{1}{{\sqrt 2 }}\left| y \right\rangle } \right) \otimes \left| z \right\rangle = \frac{u}{{\sqrt 2 }}\left| x \right\rangle \otimes \left| z \right\rangle + \frac{v}{{\sqrt 2 }}\left| y \right\rangle \otimes \left| z \right\rangle $$</span></p>
<p>Because we're representing two independent systems, we can just observe the first system, collapsing it to <span class="math-container">$|x\rangle$</span>: then the combined state based on the left-hand-side is collapsed to <span class="math-container">$|x\rangle\otimes|z\rangle$</span>. But based on the right-hand-side, this is <span class="math-container">$u|x\rangle\otimes|z\rangle$</span>, and thus <span class="math-container">$u=1$</span> and similarly for <span class="math-container">$v$</span>.</p>Mon, 01 Jul 2019 09:43:29 GMThttps://physics.stackexchange.com/questions/54896/-/489138#489138Abhimanyu Pallavi Sudhir2019-07-01T09:43:29ZAnswer by Abhimanyu Pallavi Sudhir for Closure under Lie Bracket -- how is $c''(0)$ promoted to $(f\circ c)''(0)$
https://math.stackexchange.com/questions/3264841/closure-under-lie-bracket-how-is-c0-promoted-to-f-circ-c0/3268462#3268462
0<p>Ah, never mind, it's obvious -- I just got confused because it's not true for all curves. Given <span class="math-container">$(f\circ c)''(t)$</span>, it's clearly equal to</p>
<p><span class="math-container">$$c''(t)\cdot\nabla f(t)+c'(t)\cdot\frac{d}{dt}\nabla f(t)$$</span></p>
<p>And since <span class="math-container">$c'(0)=0$</span> for the given curve, this is just equal to the first term.</p>Thu, 20 Jun 2019 09:53:36 GMThttps://math.stackexchange.com/questions/3264841/closure-under-lie-bracket-how-is-c0-promoted-to-f-circ-c0/3268462#3268462Abhimanyu Pallavi Sudhir2019-06-20T09:53:36ZDerivations and the Jacobi Identity
https://thewindingnumber.blogspot.com/2019/06/derivations-and-jacobi-identity.html
0Let's consider a new way to think of the Lie algebra to a group -- instead of just considering the tangent vector to be <i>at</i> the identity, we could smear it across the group to form a <b>vector field</b>, resolving questions of whether our tangent space "really needs to be" at the identity (the exponential map in matrix representation only exists in the traditional form if we're talking about tangent vectors at the identity, but we're free to write down the Lie algebra in this way).<br /><br />But not every vector field is a valid element of the Lie algebra. We need the vector field to be "<b>constant</b>" across the manifold in some sense, so that the constant value it takes is the tangent-vector-at-the-identity it corresponds to. But what exactly do we mean by "constant" on a Lie group?<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Wv4qP6_NVOY/XQJ0FqkQ5HI/AAAAAAAAFmE/sImh5ae4NY82cWm_nUlQmc4bWe9zQEqlACLcBGAs/s1600/leftinvariantvectorfield.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="633" data-original-width="633" height="320" src="https://1.bp.blogspot.com/-Wv4qP6_NVOY/XQJ0FqkQ5HI/AAAAAAAAFmE/sImh5ae4NY82cWm_nUlQmc4bWe9zQEqlACLcBGAs/s320/leftinvariantvectorfield.png" width="320" /></a></div>In the case of the unit circle in the complex plane, we have an idea of what we want -- the vector field $T(M)$ is constant over the group if it is determined by the value at the identity as $T(M)=MT(0)$.<br /><br />Is this preserved in the matrix representation of the group? Well, yes, because the correspondence between complex numbers and spiral matrices is a homomorphism. We can use this as a motivation to define the condition for a vector field to be an element of the Lie algebra on a matrix Lie group -- it needs to be a <b>left-invariant vector field</b>, i.e. 
we need the value of the vector field to be determined by its value at the identity as $T(M)=MT(0)$.<br /><br /><div class="twn-furtherinsight">Why left-invariant? Why not right-invariant? Why matrix multiplication at all? The choices made here are certainly arbitrary to some extent. When we study <b>abstract Lie algebra</b>, we'll just have "left-multiplication by $M$" being replaced by a group action and the usage of matrix multiplication is a <b>choice of representation</b>. In the context of abstract Lie algebra, the "left-multiplication by $M$" we're interested in is really the <i>derivative</i> of the diffeomorphism $M:G\to G$, which is a linear map between the tangent spaces at $I$ and $M$. You can show that this map is represented by matrix left-multiplication given a matrix representation (i.e. letting the group be $GL(n,\mathbb{C})$).</div><br /><hr /><br />OK, why did we just do that? Why did we upgrade our tangent vectors to vector fields? If it wasn't obvious already, the <b>noncommutativity</b> of a Lie group is "the" feature of importance in a Lie group, at least in some neighbourhood of the identity (we will later find out exactly the kind of features that aren't determined by just the Lie bracket -- the important keywords here are <b>connected</b> and <b>compact</b>) -- if the Lie group is commutative, then the Lie algebra is just a vector space with no additional structure, and the Lie group is a "basically unique" choice.<br /><br />In our discussions of noncommutativity in the <a href="https://thewindingnumber.blogspot.com/2019/05/an-easy-way-to-see-closure-under-lie.html">last article</a>, we repeatedly referred to <i>flowing along a vector</i> -- the nature of noncommutativity is inherently "dynamical" in this sense. So we need to talk about <i>differentiating along the vector field corresponding to a tangent vector</i>.<br /><br />So let's upgrade our vector fields to derivative operators, or <b>derivations</b> $D$. 
These are operators on functions $f:G\to \mathbb{R}$ that tell you the derivative of $f$ in the direction of the vector field -- the left-invariant ones are a certain generalisation of the directional derivative operators.<br /><br />Well, what exactly is a derivation? On Euclidean space, directional derivatives can be imagined as stuff of the form $f\mapsto\vec{v}\cdot\nabla f$ -- but this requires the concept of a dot product, which is quite weird within the context of matrix groups. But if you try to work this out on the unit circle (do it!), you might get an idea: we can define a <b>curve</b> $\gamma:\mathbb{R}\to G$ passing through a point and consider:<br /><br />$$f\mapsto(f\circ \gamma)'(t)$$<br />at that point -- this gives precisely the directional derivative in the direction $\gamma'(t)$ (show that this is right in Euclidean space, and make sure you understand why it is right/makes sense -- it's the chain rule, and a certain analogy exists to projecting matrices onto subspaces in linear algebra). And if we just want tangent vectors at the identity, we can just consider the operation $f\mapsto(f\circ \gamma)'(0)$.<br /><br />OK. Let's try to "<b>abstract out</b>" the properties of a derivation $D$, i.e. something that just allows us to define what a derivation is, abstractly, that is equivalent to being an operator of the above form.<br /><br />What makes an operator a directional derivative? Certainly it must be a linear operator -- but not every linear operator is a directional derivative. The key idea behind a directional derivative is that $D(f(x))$ is <b>determined in a specific way by $D(x)$, the rate at which $x$ changes</b> in the specified direction.<br /><br />How do we use this? 
Well, if you think about it a little bit, we can restrict $f$ to be analytic -- so we need:<br /><ol><li>$D(x)$ predicts $D(x^n)$ in the right way -- this is ensured by the <b>product rule</b> -- $D(fg)=f\ Dg + g\ Df$.</li><li>$D(x^n)$ for all $n$ predicts $D(a_0+a_1x+a_2x^2+\ldots)$ in the right way -- this is ensured by <b>linearity</b>.</li></ol><div class="twn-beg">If anyone can motivate the definition of a derivation without restricting to analytic functions, tell me.</div><br />An operator that satisfies these two properties is called a <b>derivation</b> -- one can prove additional properties from these axioms fairly easily, e.g. $D(c)=0$ for constant $c$, etc.<br /><br /><hr /><br />Let's think about why this whole construction above makes sense.<br /><br />Let $G$ be the group of translations of $\mathbb{R}$ -- one can parameterise them by the translated distance as $\Delta(p)$ with composition given by $\Delta(p)\Delta(q)=\Delta(p+q)$. Well, this is isomorphic to the additive group on the reals, and in turn to the multiplicative group of positive real numbers. We can consider the group to be acting on real analytic functions by translations of the domain: $\Delta_pf(x):=f(x+p)$. The Lie algebra is just spanned by the derivative of $\Delta(p)$ at the identity, that is:<br /><br />$$\Delta '(0) = \lim\limits_{h \to 0} \frac{{\Delta (h) - 1}}{h} = \frac{d}{{dx}}$$<br />And our Lie algebra members are all real multiples of $d/dx$ -- these are precisely the directional derivatives on $\mathbb{R}$. Similar constructions can be made on $\mathbb{R}^n$, or a general automorphism group.<br /><br />So we see that the "derivations" construction of the Lie algebra actually yields <b>the tangent vectors on the Lie group identified as the automorphism group of some object</b>. 
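The claim that operators of the form $f\mapsto(f\circ\gamma)'(0)$ really do satisfy the two derivation axioms can be spot-checked symbolically (a sketch assuming sympy; the particular curve through the point $(1,2)$ is a hypothetical example):

```python
import sympy as sp

t, x, y, a, b = sp.symbols('t x y a b')
f, g = sp.Function('f'), sp.Function('g')

# A hypothetical curve c(t) in the plane, passing through (1, 2) at t = 0.
c = (1 + sp.sin(t), 2 + t * sp.exp(t))

def D(h):
    """The operator h |-> (h o c)'(0), for h an expression in x and y."""
    return sp.diff(h.subs({x: c[0], y: c[1]}), t).subs(t, 0)

F, G = f(x, y), g(x, y)

# Product rule: D(fg) = f(c(0)) D(g) + g(c(0)) D(f).
product_rule = sp.simplify(D(F * G) - (F.subs({x: 1, y: 2}) * D(G)
                                       + G.subs({x: 1, y: 2}) * D(F)))
assert product_rule == 0

# Linearity: D(aF + bG) = a D(F) + b D(G).
assert sp.simplify(D(a * F + b * G) - (a * D(F) + b * D(G))) == 0
```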
If you've ever done some differential geometry, this gives you the motivation for treating partial derivatives as basis vectors.<br /><br /><div class="twn-pitfall">Our discussion of derivations so far works both for derivations (general vector fields on the manifold) and <b>point-derivations</b> (basically tangent vectors at a specific point). Under the first interpretation, though, we're <b>not actually interested in all derivations</b>, only the left-invariant ones. In the example above, an operation of the form $p(x)\frac{d}{dx}$ is linear and satisfies the product rule:<br /><br />$$p\frac{d(f\cdot g)}{dx}=g\cdot p\frac{df}{dx}+f\cdot p\frac{dg}{dx}$$<br />And why shouldn't it? It corresponds to a vector field all right -- $xe_x$. But this is not a <i>left-invariant vector field</i>.</div><br /><div class="twn-furtherinsight">Interpret the <b>Taylor series as the exponential map</b> from the Lie algebra to the Lie group! Make the "similar construction" in the multivariate case ($\mathbb{R}^n$) and interpret the <b>multivariate Taylor series</b> as an exponential map -- i.e. that $\Delta=\exp\nabla$.</div><div><div><br /></div><div><hr /></div><div><br /></div>The first thing that we can do with our formalism of point-derivations is give another proof of closure under the Lie Bracket: </div><div><br /></div><div>$$[D_1,D_2](fg)=f[D_1,D_2]g+g[D_1,D_2]f$$</div><div>I.e. that the Lie Bracket of two derivations is also a derivation. 
Check that the above is correct by expanding stuff out and using the product rule for $D_1$ and $D_2$.</div><div><br /></div><div>There's another way that derivations can be used to show closure under the Lie Bracket, which shows more closely the connection to the product rule for the second derivative discussed in the <a href="https://thewindingnumber.blogspot.com/2019/05/an-easy-way-to-see-closure-under-lie.html">previous article</a>.<br /><br />One might wonder if, like the directional derivative at the identity in the $c'(0)$ direction is given by $(f\circ c)'(0)$, the directional derivative at the identity in the $c''(0)$ direction may be given as $(f\circ c)''(0)$. Well, in general:<br /><br />$$(f\circ c)''(t)=c''(t)\cdot\nabla f(c(t))+c'(t)\cdot\frac{d}{dt}\nabla f(c(t))$$<br />Which, since $c'(0)=0$, at $t=0$ is simply equal to the first term, the directional derivative in the $c''(0)$ direction. So we just need to show that $f\mapsto (f\circ c)''(0)$ is a derivation. This follows from the <b>Leibniz rule for the second derivative</b>, and the fact that the first derivative of $c$ is zero.</div><div><br /></div><div><hr /><br /><div>OK, one more thing before we actually do something useful -- something we haven't done before in other ways.<br /><br />This is an <b>extended pitfall prevention</b>, because I fell into this pit myself. When thinking about left-invariance of a vector field $D$, I formulated the idea in my head this way: the idea is that under $D$, we should get the same result if we differentiate (derivate?) $f$ at 0 or if we translate it forward by $x$ and derivate it at $x$. i.e. where $\phi^h$ represents the translation $f(x)\mapsto f(x-h)$, we want:<br /><br />$$D=\phi^{h}D\phi^{-h}$$</div>
Well, let's consider the group action $\phi^{-h}D\phi^h$ -- certainly at $h=0$, it's just $D$, so let's differentiate it (against $h$) at 0. We get, where $d\phi_0$ is the derivative of $\phi$ at 0:<br /><br />$$[d\phi_0, D]$$<br />Which isn't zero. So my argument must be wrong -- I must have assumed abelian-ness somehow.<br /><br />Here's the problem: the final left-multiplication by $\phi^h$ is fine -- it just brings the derived function back to the origin, but "translating the function forward and then differentiating it" messes things up when the direction you're differentiating in doesn't commute with the direction of translation. Draw some pictures of curved surfaces to convince yourselves of this.<br /><br />So left-multiplication determines a sort of "parallel transport" on the Lie Group, while right-multiplication is an "alternative" way to compare vectors in different tangent spaces, and its disagreement with left-multiplication determines the non-commutativity of the group. Well, this choice of left-multiplication vs right-multiplication is really a convention, arising from the choice of representation.<br /><br /><hr /></div><div><br />OK, the useful thing: Suppose we're interested in "nested Lie brackets" $[X,[Y,Z]]$. We're talking about conjugating $[Y,Z]$ as $\phi^p[Y,Z]\phi^{-p}$ where $d\phi_0=X$ so that to first-order in $p$:<br /><br />$$\phi^p[Y,Z]\phi^{-p}=[Y,Z]+p[X,[Y,Z]]$$<br />Since conjugation is a homomorphism, we can also write:<br />$$\begin{align}<br />\phi^p[Y,Z]\phi^{-p} &= [\phi^pY\phi^{-p},\phi^pZ\phi^{-p}] \\<br /> &= [Y+p[X,Y],Z+p[X,Z]] \\<br /> &= [Y,Z] + p([Y,[X,Z]]+[[X,Y],Z])\\<br />\Rightarrow [X,[Y,Z]]&=[Y,[X,Z]]+[[X,Y],Z]<br />\end{align}$$<br />Now, couldn't we have just proven this by expanding everything out as commutators? Sure, but this provides more insight as to what's going on -- you might notice the resemblance to the product rule. 
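The identity just derived can be spot-checked numerically on random matrices, where the Lie bracket is the commutator (a sketch assuming numpy; the $4\times4$ matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def bracket(a, b):
    # For matrices, the Lie bracket is the commutator.
    return a @ b - b @ a

# Three arbitrary (hypothetical) random matrices standing in for X, Y, Z.
X, Y, Z = (rng.standard_normal((4, 4)) for _ in range(3))

# [X, [Y, Z]] = [Y, [X, Z]] + [[X, Y], Z]
lhs = bracket(X, bracket(Y, Z))
rhs = bracket(Y, bracket(X, Z)) + bracket(bracket(X, Y), Z)
assert np.allclose(lhs, rhs)
```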
Indeed, this identity -- <b>the Jacobi identity</b> -- is perhaps best stated as:<br /><br />"<b>A derivation $X$ acts through the Lie Bracket as a derivation on the space of derivations where "multiplication" is given by the Lie Bracket.</b>"<br /><br />In this sense, it's actually quite expected -- it results from the fact that the Lie Bracket is a bilinear operator obtained from <b>differentiating a group symmetry, conjugation</b> -- this mandates that it is a derivation.<br /><br />As it turns out, the Jacobi identity, along with the antisymmetry and the bilinearity, determines the Lie Algebra -- it is enough to "abstract out" the properties of a Lie Algebra. Why? This is something we will see over several articles, which will then allow us to motivate abstract Lie algebra.</div>derivationsdifferential geometryjacobi identitylie algebralie bracketlie groupslie theoryMon, 17 Jun 2019 01:38:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7383994582760830555Abhimanyu Pallavi Sudhir2019-06-17T01:38:00ZClosure under Lie Bracket -- how is $c''(0)$ promoted to $(f\circ c)''(0)$
https://math.stackexchange.com/questions/3264841/closure-under-lie-bracket-how-is-c0-promoted-to-f-circ-c0
1<p>I've seen numerous different proofs that the tangent space to a Lie group is closed under <span class="math-container">$[\cdot,\cdot]$</span>, i.e. that the Lie Bracket of two derivations is a derivation -- e.g. considering and differentiating the curve <span class="math-container">$e^{\sqrt{t}X}e^{\sqrt{t}Y}e^{-\sqrt{t}X}e^{-\sqrt{t}Y}$</span>, or just showing that <span class="math-container">$[D_1,D_2]$</span> follows the product rule.</p>
<p>But one derivation I don't get comes from Timothy Goldberg's set of lecture notes <em><a href="http://pi.math.cornell.edu/~goldberg/Talks/Flows-Olivetti.pdf" rel="nofollow noreferrer">The Lie Bracket and the Commutator of Flows</a></em>. Here's the process:</p>
<ol>
<li>Define the curve <span class="math-container">$c(t)=\Phi_X^t\Phi_Y^t\Phi_X^{-t}\Phi_Y^{-t}(e)$</span>.</li>
<li>Show that <span class="math-container">$[X,Y]=\frac12c''(0)$</span>.</li>
<li>Define an operation <span class="math-container">$D:f(t)\mapsto (f\circ c)''(0)$</span>.</li>
<li>Show that <span class="math-container">$D$</span> is a derivation.</li>
</ol>
<p>It's Step 3 I don't get. How do we know this operator <span class="math-container">$D$</span> is what "upgrades" <span class="math-container">$[X,Y]$</span> into a vector field? How can we show that <span class="math-container">$[X,Y]$</span> is the direction in which <span class="math-container">$D$</span> differentiates <span class="math-container">$f$</span>?</p>lie-groupslie-algebraslie-derivativeMon, 17 Jun 2019 00:39:06 GMThttps://math.stackexchange.com/q/3264841Abhimanyu Pallavi Sudhir2019-06-17T00:39:06ZDealing with eigenspaces; noncommuting variables and another postulate
https://thewindingnumber.blogspot.com/2019/06/dealing-with-eigenspaces-noncommuting.html
0At the end of the <a href="https://thewindingnumber.blogspot.com/2019/06/projection-operators-generalised-borns.html">last article</a>, you might've wondered how one might talk about 3-dimensional position -- so far, we've only considered an operator representing a one-dimensional position, e.g. to find the $x$-co-ordinate of something. This is obviously insufficient. What we need to measure three-dimensional position is <b>three separate measurements</b> of the different spatial dimensions.<br /><br />But there's a problem here -- we know that upon an observation, the state vector is modified, in that it is replaced by some eigenstate of the observable in question. So after measuring the $x$-co-ordinate, if we measure the $y$-co-ordinate afterwards, are we "really measuring" the $y$-co-ordinate of the particle as it was in its initial state, or have we shaken it around a bit?<br /><br />Let's try to think very precisely about what's going on here. The first question to ask is -- what do the eigenstates of the $X$ operator look like?<br /><br />Well, because it's an observable, it must have eigenstates that produce a full eigenbasis. But if each eigenvalue <b>corresponded to just one eigenstate</b>, then we would only have information about the $x$-positions of particles, which is clearly insufficient to represent the entire state of a particle. So we must have each eigenvalue -- each $x$-position -- correspond to <b>an infinitude of states, an eigenspace,</b> corresponding to each position with the same $x$-position (which, remember, is their eigenvalue), <b>and their superpositions thereof</b>. And this makes a lot of sense -- each position is a state, but these positions give us the same values for the $x$-position.<br /><br />What this means is that each function of the form $g(y,z)\delta(x-x_0)$ is an eigenstate of the $X$ operator, with eigenvalue $x_0$. 
So we can have something like $g(y,z)=\delta(y-y_0)\delta(z-z_0)$, which would also be an eigenstate of the $Y$ and $Z$ operators (with eigenvalues $y_0$ and $z_0$ respectively), or some other linear combination, which would no longer be an eigenstate of $Y$ and $Z$.<br /><br />OK. So what happens when we observe $X$ taking the value $x$? You might think that the state just turns into some <b>randomly chosen eigenstate </b>with the observed eigenvalue $x$. But if you think about it, this would be quite <b>unphysical</b>, as this would mean our $X$-observation would <i><b>magically change</b></i> <b>our knowledge about the $y$ and $z$ positions</b> too (for example, if the state collapsed into a state that is also an eigenstate of $Y$, we would have accidentally completely measured the $y$ position) -- but we can certainly design experiments in which an observation of an $x$ position does not so radically rattle the particle in the $y$ and $z$ directions.<br /><br /><div class="twn-furtherinsight">Another way to think about this is that the eigenvalues of an operator are the measurements we're getting out. If a state is already in an eigenspace corresponding to the eigenvalue $\lambda$, and we "measure" the observable again (i.e. do nothing), the state shouldn't change.</div><br />So we don't want the observation to change the $y$ and $z$ probability information in any way -- so what we're looking for is a <b>projection of the state into the eigenspace</b> with eigenvalue $x$. This is in line with our discussion of the generalised Born's rule in the <a href="https://thewindingnumber.blogspot.com/2019/06/projection-operators-generalised-borns.html">last article</a> -- but it is an <b>additional postulate</b> of quantum mechanics, or rather generalises the existing postulate about states projecting randomly into eigenstates.<br /><br /><div class="twn-furtherinsight">Weren't the eigenvalues completely irrelevant? You ask. 
You can just make the eigenvalues whatever you want, they're just labels to the eigenstates, right? Not really -- the eigenvalues <em>are</em> exactly what you measure. You can choose to measure any function of them, but if you use a function that isn't injective, you are measuring <em>less information</em> about the system, and you're collapsing the state "less" in this sense.</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Et48Tk7LJKk/XPGTZ8d7geI/AAAAAAAAFlY/hfBffc8ndk0iGTaYB3E1f1YlgaoIj1VYwCPcBGAYYCw/s1600/projection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="394" data-original-width="296" height="320" src="https://1.bp.blogspot.com/-Et48Tk7LJKk/XPGTZ8d7geI/AAAAAAAAFlY/hfBffc8ndk0iGTaYB3E1f1YlgaoIj1VYwCPcBGAYYCw/s320/projection.png" width="240" /></a></div><br />(<i>Unlike in the projection above, the subspace being projected onto upon measurement of X is itself an infinite-dimensional space, spanned by the different positions Y and Z can take. Oh, and we have to normalise the projected state.</i>)<br /><br />Something that we've seen in this discussion above is that there is a <b>common eigenbasis</b> for $X$, $Y$ and $Z$ -- specifically, the <b>position basis</b>, the basis of Dirac delta distributions centered at the different points in three-dimensional space. From linear algebra, this is equivalent to saying that $X$, $Y$ and $Z$ commute.<br /><br />What this means is that when you then go on to measure $Y$, and then $Z$, you end up in a state that is a common eigenstate of $X$, $Y$ and $Z$ -- so that you have precise values for each co-ordinate of the position. And as the $X$ information is only altered in the $X$ observation, etc. 
so the probability distribution for each variable is the same <b>regardless of the order</b> you measure them in -- so <b>three-dimensional position is indeed well-defined</b> in quantum mechanics.<br /><br /><hr /><br /><div class="twn-pitfall">Just to be clear, the fact that $X$, $Y$ and $Z$ commute is a <b>postulate</b> -- equivalent to the physical claim that each position in space is in fact an eigenstate, that we can in fact pinpoint the position of a particle exactly. We <em>cannot</em> do this e.g. for position and momentum -- $(x,p)$ pairs cannot be considered eigenstates, as there is no simultaneous eigenstate of $X$ and $P$. So for example, you <b>can't just construct a spacefilling curve</b> in $(x,p)$ space to measure position and momentum simultaneously, because the parameters of the curve would simply not have any corresponding eigenstates. The $(x,p)$ space <b>does not exist in the Hilbert space</b> -- there are no states that precisely put down the values of position and momentum. It is possible to construct quantum mechanical theories -- called <b>non-commutative quantum theories</b> -- in which the $(x,y,z)$ space isn't in the Hilbert space either, so that our perception of three-dimensional positions must necessarily be approximate.<br /><br />We're assuming here that this is not so, that three-dimensional space does form an eigenbasis for the $X$, $Y$ and $Z$ operators, that the representation of the $Y$ operator in the $X$ basis is indeed $\psi(x,y,z)\mapsto y\psi(x,y,z)$, not something weird and fancy.</div><br /><hr /><br />A very different picture arises when you have noncommuting variables. Suppose two operators $X$ and $P$ don't commute, i.e. there is no common eigenbasis for them. 
So once you observe $X$ and put it in some eigenspace of $X$, there is a non-zero probability that the state will have to be projected out of this $X$-eigenspace when $P$ is measured.<br /><br />So this means that the observables $X$ and $P$ <b>cannot be measured simultaneously</b>. Some specific bounds on the uncertainties will be discussed in the <a href="https://thewindingnumber.blogspot.com/2019/06/position-momentum-bases-fourier.html">next article</a>. For now, let's demonstrate an example of two noncommuting variables: <b>position</b> and <b>momentum</b> (in the same direction).<br /><br /><b>NOTE: We will show in the next article the given results about momentum being $-i\hbar \frac{\partial}{\partial x}$, etc. Just intuit them out here from its eigenvectors.</b><br /><br />As we've shown before, the position and momentum operators can be given in the position basis as $x$ and $-i\hbar\partial/\partial x$ respectively. What this means is that given a wavefunction $\psi(x)$, it transforms under these operators as $x\psi(x)$ and $-i\hbar\psi'(x)$ respectively (check that this makes sense -- especially for the position case -- and also that one can go the other direction and show that the <b>corresponding eigenvectors</b> of the position operator must be <b>Dirac delta functions</b>).<br /><br />So do these operators commute? Clearly not -- the eigenbasis of one is Dirac delta functions in $x$, the other's is sinusoids in $x$. But we can also verify this computationally:<br /><br />$$\begin{align}XP &= -i\hbar x \frac{\partial}{\partial x}\\<br />PX\{\psi(x)\}&=-i\hbar\frac{\partial}{\partial x}(x\psi(x))\\<br />&= -i\hbar\left[\psi(x)+x\psi'(x)\right]\\<br />\Rightarrow PX &= -i\hbar x\frac{\partial}{\partial x} -i\hbar\end{align}$$<br /><br />So we have the commutator $i[X,P]=-\hbar$ (why do we talk about $i[A,B]$? Because as it is easy to see, for any Hermitian $A$ and $B$, this is Hermitian, while $[A,B]$ is simply anti-Hermitian). 
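The computation above can be verified symbolically (a sketch assuming sympy; $\psi$ is an arbitrary wavefunction):

```python
import sympy as sp

x, hbar = sp.symbols('x hbar', real=True)
psi = sp.Function('psi')

# Position and momentum operators acting on a wavefunction in the position basis.
Xop = lambda f: x * f
Pop = lambda f: -sp.I * hbar * sp.diff(f, x)

# The commutator [X, P] applied to an arbitrary wavefunction psi(x):
commutator = Xop(Pop(psi(x))) - Pop(Xop(psi(x)))

# [X, P] psi = i*hbar*psi, hence i[X, P] = -hbar as computed above.
assert sp.simplify(commutator - sp.I * hbar * psi(x)) == 0
```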
This is the "purest" commutator -- a (scaled) Identity operator. Since we didn't use any other properties of position and momentum, <b>this is a property of all observables that are Fourier transforms of each other/canonically conjugate observables </b>(more on this in the next article).<br /><br /><hr /><br /><b>Exercise:</b> Write down the most generalised form of Born's rule accounting for generalised eigenspaces (the answer is identical to what we've already written, but make sure you understand it). Show, as in <a href="https://thewindingnumber.blogspot.com/2019/06/projection-operators-generalised-borns.html">the last article</a>, that the probability density of finding a particle somewhere in three-dimensional space is $|\Psi(x,y,z)|^2$ -- make sure you define $\Psi(x,y,z)$ clearly!<br /><br />commutingeigenspacesnoncommutingphysicsquantum mechanicsuncertainty principleMon, 03 Jun 2019 00:38:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-1341062659577307162Abhimanyu Pallavi Sudhir2019-06-03T00:38:00ZPosition, momentum bases and operators, Fourier transform, uncertainty
https://thewindingnumber.blogspot.com/2019/06/position-momentum-bases-fourier.html
0In this article, we'll assume the de Broglie relation for all particles -- i.e. that their momentum is given by $p=hf$. This is actually quite an incredible assumption, even if not surprising -- we've accepted that a particle is a wave in the sense of probability (the wave describes the probability amplitude densities of finding it at some point), but why at all should the spatial frequency of the probability wave relate to its momentum?<br /><br />Well, it's natural for you to find this assumption unsatisfactory. We've been quite liberal in assuming the de Broglie relation earlier when <a href="https://thewindingnumber.blogspot.com/2019/05/from-polarisation-to-quantum-mechanics.html">motivating quantum theory</a>, too -- we'll later produce some motivation for the de Broglie relation for photons, and discuss derivations <i>from</i> quantum mechanics, axiomatising our theory clearly to eliminate circularities. But for now, let's not.<br /><br />The key point of $p=hf$ is that for a sinusoidal wave $e^{i \cdot 2\pi f \cdot x}$ (so the probability density is uniform, and the standard deviation in the observation of the particle's position is infinite), the momentum takes a specific <i>definite</i> value, $hf$, with zero standard deviation.<br /><br />Well, what if the wavefunction isn't a simple sinusoid, but some other distribution $\Psi(x)$? If you did all the assigned exercises in the <a href="https://thewindingnumber.blogspot.com/2019/05/from-polarisation-to-quantum-mechanics.html">first article</a>, you should know the answer (if not, work it out before reading on). Classically, if you could write that wavefunction as a sum of sinusoids (i.e. use a Fourier transform), then each sinusoid would have its own momentum and there would be some chunk of your matter in each of those momenta, forming a momentum distribution. 
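The classical picture just described -- decompose the wave into sinusoids, each carrying momentum $p=hf$ -- can be sketched numerically with a discrete Fourier transform. This is an illustrative example, not from the original post: a hypothetical Gaussian packet riding on a carrier of spatial frequency $f_0$, whose momentum distribution peaks at $p=hf_0$ (units with $h=1$ are an assumption of the sketch).

```python
import numpy as np

h = 1.0                      # work in units where Planck's constant h = 1
f0 = 2.0                     # carrier spatial frequency
n, L = 2048, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)

# Gaussian envelope times the sinusoid e^{i * 2*pi*f0 * x}
psi = np.exp(-x**2) * np.exp(2j * np.pi * f0 * x)

phi = np.fft.fftshift(np.fft.fft(psi))           # amplitude per frequency bin
f = np.fft.fftshift(np.fft.fftfreq(n, d=L / n))  # spatial-frequency grid
p = h * f                                        # de Broglie: p = h * f

prob = np.abs(phi)**2
prob /= prob.sum()           # normalise into a (discrete) momentum distribution
print(p[np.argmax(prob)])    # -> 2.0, i.e. p = h * f0
```

Each FFT bin plays the role of one sinusoid in the decomposition, and `prob` is the "how much of the wave is at each momentum" distribution described above.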
In quantum mechanics, you can't have chunks of a single quantum, so this distribution is a probability distribution (still a probability <i>amplitude</i> distribution, because we want superposition). We'll use the notation $\Psi(p)$ to represent this "<b>momentum-space wavefunction</b>", and we'll see why soon.<br /><br />So it's not too hard to see that the frequency distribution is simply the Fourier transform of $\Psi(x)$, while the momentum-space wavefunction is given by:<br /><br />$$\Psi(p)=\frac1h \mathcal{F}_x^{p/h}(\Psi(x))$$<br />Where $\mathcal{F}_x^{p/h}(\Psi(x))$ is the Fourier transform of $\Psi(x)$ (which is a function of $f$) written with the variable substitution $f=p/h$. Note that we're considering the non-normalised Fourier transform, in terms of ordinary frequencies.<br /><br />Well, $\Psi(x)\, dx$ and $\Psi(p)\, dp$ are just the representations of the state vector in the position and momentum bases respectively. So the <b>Fourier transform acts as a <i>change-of-basis matrix</i></b> from the position basis to the momentum basis. I.e.<br /><br />$$|\psi\rangle_P=F|\psi\rangle_X$$<br />Its inverse $F^{-1}$ precisely represents the <b>eigenstates of the momentum operator</b> written in the position basis, and the corresponding eigenvalues are the actual values of the momenta. So we have eigenstates $\frac1h e^{ix \cdot 2\pi p / h} dp$ with corresponding eigenvalues $p$.<br /><br />Before going any further, let's make sure we know exactly what this means: our matrix $F^{-1}$ is an uncountably infinite-dimensional "matrix" whose "indices" are denoted as $(x,p)$ in the rows-by-columns format. Its general entry is $\frac1h e^{ix \cdot 2\pi p / h} dp$, and each column -- here's the important bit -- each column holds <i>p</i> constant and varies <i>x</i> -- i.e.
each eigenstate of $P$ is a function of $x$.<br /><br />Anyway, so we're looking for a linear operator $P$ solving the eigenvalue problem (and we're just ignoring the scalar multiples):<br /><br />$$P e^{ix \cdot 2\pi p / h} = pe^{ix \cdot 2\pi p / h}$$<br />It should be quite clear that the operator we're looking for is:<br /><br />$$\begin{align}P &= \frac{h}{2\pi i}\frac{\partial}{\partial x} \\<br />&= -i\hbar \frac{\partial}{\partial x} \end{align}$$<br />We need to be clear that this is the representation of the momentum operator <i>in the position basis</i> -- in the momentum basis, its representation is simply "$p$" (i.e. its action on each eigenstate $|p\rangle$ is to multiply it by $p$). Similarly, it should be easy to show that in the momentum basis,<br /><br />$$X=i\hbar\frac{\partial}{\partial p}$$<br /><b>Exercise:</b> make sure you clearly know and understand what the eigenvectors and eigenvalues of $X$ and $P$ are, in both the position and momentum bases. Hint: something about the Dirac delta function.<br /><br /><hr /><br /><b>Derivation of Heisenberg and Robertson-Schrodinger uncertainty principles</b><br /><b><br /></b>We can derive a variety of "uncertainty principles" -- inequalities showing the trade-off between the certainties of two observables -- with some basic algebraic manipulation. It is important to note that none of these individual uncertainty principles is really much more fundamental than any of the others (or at least I don't see in what way they can be) -- one can always make stronger bounds for the uncertainty, and many stronger bounds exist than the ones we're showing here -- but the <i>concept</i> of an uncertainty principle is crucial, in that it demonstrates a rigorous difference between <b>quantum mechanics and statistical physics</b>. In general, the noncommutativity of observables (having no shared eigenbasis) is something that has no analog in classical physics.<br /><br />OK. 
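These representations can be exercised numerically. The sketch below (an illustration, not from the original post; grid sizes and the width $\sigma$ are arbitrary choices of mine) computes $\Delta x\,\Delta p$ for a Gaussian wave packet, applying $X$ as multiplication by $x$ and $P$ as $-i\hbar\,\partial/\partial x$ via finite differences. The product lands on $\hbar/2$, the minimum permitted by the uncertainty relations derived below.

```python
import numpy as np

hbar = 1.0
n, L = 4096, 40.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]

sigma = 1.3                                   # arbitrary packet width
psi = np.exp(-x**2 / (4 * sigma**2))
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)   # normalise the state

def expval(op_psi):
    # <psi|O|psi> approximated by a Riemann sum
    return np.real(np.sum(np.conj(psi) * op_psi) * dx)

ex, ex2 = expval(x * psi), expval(x**2 * psi)       # X acts by multiplication
dpsi = np.gradient(psi, dx)                          # P acts by -i*hbar*d/dx
d2psi = np.gradient(dpsi, dx)
ep, ep2 = expval(-1j * hbar * dpsi), expval(-hbar**2 * d2psi)

dx_u = np.sqrt(ex2 - ex**2)    # = sigma
dp_u = np.sqrt(ep2 - ep**2)    # = hbar / (2 * sigma)
print(dx_u * dp_u)             # ~ 0.5 = hbar / 2
```

Changing `sigma` trades position uncertainty for momentum uncertainty while the product stays pinned at $\hbar/2$; the Gaussian is exactly the minimum-uncertainty state.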
So we'll show two statements about the product of uncertainties of two observables, $(\langle A^2\rangle - \langle A\rangle^2)^{1/2}(\langle B^2 \rangle - \langle B \rangle^2)^{1/2} $. Once again, there is nothing special about the specific relations we will show -- we can consider other combinations than products, like $\Delta a^2 + \Delta b^2$, and indeed, there exist uncertainty relations for such terms.<br /><br />Defining $A'=A-\langle A\rangle$ and $B'=B-\langle B\rangle $ for <b>Hermitian</b> (this is important!) $A$ and $B$, we see that:<br /><br />$$\begin{align}<br />\langle A'^2\rangle \langle B'^2 \rangle &= \langle \psi | A'^2 | \psi \rangle \langle \psi | B'^2 | \psi \rangle \\<br />&= \langle A' \psi | A' \psi \rangle \langle B' \psi | B' \psi \rangle \\<br />&\ge |\langle \psi | A' B' | \psi \rangle| ^ 2 \\<br />&= \left|\frac12 \langle\psi|A'B'+B'A'|\psi\rangle + \frac12\langle\psi|A'B'-B'A'|\psi\rangle\right|^2 \\<br />&= \frac14 |\langle\psi|A'B'+B'A'|\psi\rangle|^2 + \frac14|\langle\psi|A'B'-B'A'|\psi\rangle|^2 \\<br />&= \frac14 |\langle \{A-\langle A\rangle, B-\langle B\rangle\} \rangle| ^2 + \frac14 |\langle [A,B]\rangle|^2\\<br />&= \frac14 |\langle\{A,B\} \rangle - 2\langle A\rangle \langle B\rangle |^2 + \frac14|\langle[A,B]\rangle|^2\\<br />\Rightarrow \Delta a\,\Delta b &\ge \frac12 \sqrt{|\langle\{A,B\} \rangle - 2\langle A\rangle \langle B\rangle |^2 + |\langle [A,B]\rangle|^2}<br />\end{align}$$<br />This is the <b>Robertson-Schrodinger relation</b>.<br /><br />(Guide in case you get stuck somewhere -- <i>line 3</i>, Cauchy-Schwarz inequality; <i>line 4</i>, splitting into Hermitian and anti-Hermitian parts; <i>line 5</i>, magnitude of a complex number -- I'm not sure if I can give any better motivation for specifically considering the product of the standard deviations -- like I said, these specific relations are not really that fundamental. 
I guess we just want to illustrate the <b>point</b> of "the" uncertainty principle, regardless of the specific ways in which it is treated, and would like to get a simple form for it, regardless of how weak or strong it may be.)<br /><br />One may weaken the inequality further, writing (and this is equivalent to having ignored the real part in line 4, saying the magnitude of a complex number is at least that of the imaginary part):<br /><br />$$\Delta a\,\Delta b \ge \frac12 |\langle [A,B]\rangle|$$<br />This is the <b>Heisenberg uncertainty relation</b>. In particular, in the <a href="https://thewindingnumber.blogspot.com/2019/06/dealing-with-eigenspaces-noncommuting.html">last article</a>, we showed that for the position and momentum operators, $[X,P]=i\hbar$. So in this case, we get the celebrated identity:<br /><br />$$\Delta x\, \Delta p \ge \frac{\hbar}{2}$$<br />For canonically conjugate $X$ and $P$.<br /><br />As mentioned before, other stronger uncertainty relations exist for general observables. Some examples can be found on the Wikipedia page <a href="https://en.wikipedia.org/wiki/Stronger_uncertainty_relations">Stronger uncertainty relations</a> (<a href="https://en.wikipedia.org/w/index.php?title=Stronger_uncertainty_relations&oldid=874251670">permalink</a>).fourier transformsmomentum basisphysicsposition basisquantum mechanicswavefunctionSun, 02 Jun 2019 18:17:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3571696487893417575Abhimanyu Pallavi Sudhir2019-06-02T18:17:00ZProjection operators, generalised Born's rule, position basis, wavefunction
https://thewindingnumber.blogspot.com/2019/06/projection-operators-generalised-borns.html
0At the end of the <a href="https://thewindingnumber.blogspot.com/2019/05/from-polarisation-to-quantum-mechanics.html">last article</a>, I asked you to investigate Born's rule for continuous variables like position and momentum.<br /><br />Well, the problem is that if $x$ is continuously distributed (i.e. we have an operator $X$ whose eigenvalues form a continuous spectrum $\Sigma_X$), typically $P(x=\lambda)=0$ -- and this gives us very little information about the actual probability distribution. What we're really interested in is $P(x\in B)$ for $B$ some subset of $\Sigma_X$.<br /><br /><div class="twn-furtherinsight">Technically, we need $B$ to be a "Borel subset", or "measurable subset". We will be omitting several such technicalities in the article, such as the need for the spectral theorem to define a "projection-valued measure" or "spectral measure" on an operator with a continuous spectrum -- this is something that will be covered in the <a href="https://thewindingnumber.blogspot.com/p/1103.html">MAO1103: Linear Algebra course</a>.</div><br />First, let's think about $P(x\in B)$ in the countable case. One can write $B=\{\lambda_1,\ldots,\lambda_n\}$, and then simply say that<br /><br />$$P(x\in B)=\sum |\langle\psi|\phi_k\rangle|^2$$<br />But the term on the right is a Pythagorean sum -- specifically, it is the length-squared of the vector formed by summing all the projections of $|\psi\rangle$ onto the eigenstates $|\phi_1\rangle\ldots|\phi_n\rangle$. But this is the same as the length-squared of the projection of $|\psi\rangle$ onto the span of these eigenstates.<br /><br />(<b>Note on notations:</b> From here onwards, we will use the notation $|\lambda\rangle$ to refer to the eigenvector corresponding to the eigenvalue $\lambda$ (if the eigenspace has dimension more than 1, we'll figure something out). 
We will use the notation $\{|B\rangle\}$ to refer to the span of the eigenvectors corresponding to the eigenvalues in $B$.)<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Et48Tk7LJKk/XPGTZ8d7geI/AAAAAAAAFlQ/_RK3VpoUgvoRp1dfUpK_zzE98L-8mqwfQCLcBGAs/s1600/projection.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="394" data-original-width="296" height="320" src="https://1.bp.blogspot.com/-Et48Tk7LJKk/XPGTZ8d7geI/AAAAAAAAFlQ/_RK3VpoUgvoRp1dfUpK_zzE98L-8mqwfQCLcBGAs/s320/projection.png" width="240" /></a></div>So we could just define a Hermitian <b>projection operator</b> $L_X(B)$ for any subset $B$ of the spectrum of $X$ -- it is an easy exercise to write down an explicit form for $L_X(B)$ in terms of the eigenvectors of $X$.<br /><br />Then the probability $P(x\in B)$ is simply $|L_X(B)|\psi\rangle|^2$. Recalling that a Hermitian projection operator satisfies $L^*=L=L^2$, we can write the <b><i>generalised Born's rule</i></b> as:<br /><br />$$\begin{align}P(x\in B) &= |L_X(B)|\psi\rangle|^2\\ &= \langle\psi|L_X(B)|\psi\rangle\end{align}$$<br />Well, this is interesting! In the last article, you proved that the <b>expected value</b> of an observable $X$ given a state $|\psi\rangle$ is given by $\langle\psi|X|\psi\rangle$. But here we have a <i><b>probability</b></i> given by the same expression. So we want to interpret our projection operators as some sort of "observable" -- we can omit the "Hermitian", since all observables are Hermitian.<br /><br />There's another place you might've seen something like this, and that is with <i>indicator variables</i> in probability and statistics -- <i>the expected value of an indicator variable for an event is the probability that the event occurs</i>.<br /><br />Try to interpret these projection operators as observables that are analogous to "indicators" in some sense. 
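It may help to see all of this concretely first. The following is a small finite-dimensional sketch (an illustration, not from the original post): build $L_X(B)$ explicitly from the eigenvectors of a random Hermitian $X$, and check both that it is a Hermitian projection and that $\langle\psi|L_X(B)|\psi\rangle$ reproduces the sum of Born probabilities over $B$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random 4x4 Hermitian "observable" X and its orthonormal eigenbasis.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
X = (A + A.conj().T) / 2
eigvals, V = np.linalg.eigh(X)             # columns of V are eigenvectors

# Projector L_X(B) onto the span of eigenvectors with eigenvalues in B.
in_B = eigvals < 0                          # e.g. B = {lambda : lambda < 0}
L = V[:, in_B] @ V[:, in_B].conj().T

# L is a Hermitian projection operator: L* = L = L^2.
assert np.allclose(L, L.conj().T) and np.allclose(L, L @ L)

# P(x in B) = |L_X(B)|psi>|^2 = <psi|L_X(B)|psi> = sum over B of |<phi_k|psi>|^2.
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
p1 = np.linalg.norm(L @ psi) ** 2
p2 = (psi.conj() @ L @ psi).real
p3 = np.sum(np.abs(V[:, in_B].conj().T @ psi) ** 2)
print(np.isclose(p1, p2) and np.isclose(p1, p3))   # True
```

The explicit form hidden in the exercise above is visible in the code: $L_X(B)=\sum_{\lambda_k\in B}|\phi_k\rangle\langle\phi_k|$.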
If you think a little about it, you might see exactly what these observables represent: the eigenvalues of $L_X(B)$ are all either 1 or 0 -- if the value "1" is realised, the state has been projected into $\{|B\rangle\}$ -- and if the value "0" is realised, it hasn't.<br /><br />So projection operators are a special type of observable, measuring the answer to "<b>Yes/No questions</b>" -- if the answer to "is the system in one of the states $\{|B\rangle\}$?" is <b>yes</b>, the observable $L_X(B)$ takes the value 1 -- if the answer is <b>no</b>, then it takes the value 0. So it is precisely an "<b>indicator variable</b>" for $\{|B\rangle\}$.<br /><br />We have seen such projection operators, of course, in the context of polarisation -- where the operator represented whether or not the photon has passed through. Indeed, one may formulate quantum mechanics entirely in terms of projection operators, as any question can be formulated with some number of Yes/No questions (the key reason this can be done, as we will see, is that these "yes/no questions" all commute, i.e. the corresponding projection operators share an eigenbasis). Let's not.<br /><br /><hr /><br />Well, this can be generalised in the straightforward way to an operator with a continuous spectrum, resulting in the same expression. We can also calculate probability <i>densities</i> using this result. Let $X$ be an operator with continuous spectrum $\Sigma_X$ -- then we can write the state $|\psi\rangle$ in the eigenbasis of $X$:<br /><br />$$ |\psi\rangle = \int_{\Sigma_X} |x\rangle\, \Psi(x)\, dx $$<br />Where $\Psi(x)\, dx=\langle\psi|x\rangle$ are the coefficients of the state in the eigenbasis, i.e. the probability amplitudes -- we call $\Psi(x)$ the <b>wavefunction</b>, and it represents <i>probability amplitude densities</i>. 
Then for a set $B\subseteq \Sigma_X$ of eigenvalues, $L_X(B)|\psi\rangle$ is the projection:<br /><br />$$ L_X(B)|\psi\rangle = \int_B |x\rangle\, \Psi(x)\, dx $$<br />And one may calculate the inner product, noting that complex inner products require taking the complex conjugate:<br /><br />$$ \langle \psi | L_X(B) | \psi \rangle = \int_B \Psi^*(x)\, \Psi(x)\, dx $$<br />Which gives us an expression for the <b>probability density function</b> on $\Sigma_X$ as:<br /><br />$$\begin{align}\rho(x) &=\Psi^*(x)\,\Psi(x) \\<br />&=|\Psi(x)|^2\end{align}$$<br />And this applies to any operator with a continuous spectrum, like position and momentum.<br /><br /><hr /><br /><div class="twn-pitfall">Some texts define the eigenvectors $|x\rangle$ of a continuous-spectrum observable differently from us -- it is often conventional to let $|x\rangle$ be infinitely large so that $\langle x_1|x_2\rangle = \delta(x_1-x_2)$. This is so that the amplitudes $\langle\psi|x\rangle$ are not infinitesimal, but instead $\langle\psi|x\rangle=\Psi(x)$ (without multiplication by $dx$). For consistency with discrete spectra, we do not use this convention.</div>born's ruleindicator variableslinear algebraposition basisprobabilityprojectionsquantum mechanicswavefunctionSat, 01 Jun 2019 12:13:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-7733053951235093302Abhimanyu Pallavi Sudhir2019-06-01T12:13:00ZAnswer by Abhimanyu Pallavi Sudhir for Is there a mathematical basis for Born rule?
https://physics.stackexchange.com/questions/215602/is-there-a-mathematical-basis-for-born-rule/483618#483618
1<p>One motivation comes from looking at light waves and polarisation -- when light passes through some filter, the energy of a light wave is scaled by <span class="math-container">$\cos^2\theta$</span> -- for a single photon, this means (as you can't have <span class="math-container">$\cos^2\theta$</span> of a photon) there is a probability of <span class="math-container">$\cos^2\theta$</span> that the number of photons passing through is "1". This <span class="math-container">$\cos\theta$</span> is simply the dot product of the "state vector" (polarisation vector) and the eigenvector of the number operator associated with polarisation filter with eigenvalue 1 -- i.e. the probability of observing "1" is <span class="math-container">$|\langle\psi|1\rangle|^2$</span>, and the probability of observing "0" is <span class="math-container">$|\langle\psi|0\rangle|^2$</span>, which is Born's rule.</p>
<p>So if you're motivating the state vector based on the polarisation vector, you can motivate Born's rule from <span class="math-container">$E=|A|^2$</span>, as above.</p>
<p>More abstractly, if you accept the other axioms of quantum mechanics, Born's rule is sort of the "only way" to encode probabilities, as you want probability of the union of disjoint events to be additive (equivalent to the Pythagoras theorem) and the total probability to be one (the length of the state vector is one). </p>
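This additivity-plus-normalisation point can be illustrated in a few lines (a numerical sketch, not part of the original answer): for any unit state vector and any orthonormal eigenbasis, the squared amplitudes behave exactly like a probability distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Columns of a random unitary Q serve as an orthonormal eigenbasis.
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5)))
psi = rng.normal(size=5) + 1j * rng.normal(size=5)
psi /= np.linalg.norm(psi)                 # state vector of length one

probs = np.abs(Q.conj().T @ psi) ** 2      # Born probabilities |<phi_k|psi>|^2

# Nonnegative, and total probability one (Pythagoras / unitarity).
print(np.all(probs >= 0), np.isclose(probs.sum(), 1.0))   # True True
```

Grouping the basis vectors into disjoint subsets and summing the corresponding entries of `probs` is exactly the "union of disjoint events" additivity mentioned above.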
<p>But there is no way to "derive" the Born rule, it is an axiom. Quantum mechanics is fundamentally quite different to e.g. relativity, in the sense that it develops a whole new abstract mathematical theory to connect to the real world. So unlike in relativity, you don't have two axioms that are literally the result of observation and everything is derived from it -- instead, you have an axiomatisation of the mathematical theory, and then a way to connect the theory with observation, which is what Born's rule is. Certainly the <em>motivation</em> for quantum mechanics comes from wave-particle duality, but this is not an axiomatisation.</p>Fri, 31 May 2019 23:33:47 GMThttps://physics.stackexchange.com/questions/215602/-/483618#483618Abhimanyu Pallavi Sudhir2019-05-31T23:33:47ZFrom polarisation to quantum mechanics: states, observables, Born's law
https://thewindingnumber.blogspot.com/2019/05/from-polarisation-to-quantum-mechanics.html
0Like most texts on the theory, I will motivate the mathematics of quantum mechanics from the example of polarisation -- mostly because it's a very accessible example of stuff being wavelike. From this example, we will be able to motivate: the <b>state vector</b> (generalising the polarisation), <b>state vector collapse</b> (the event of polarisation), <b>observables and their eigenvalues</b> (stuff like energy, number of photons, etc.), <b>eigenstates and their orthogonality </b>(polarisation basis), <b>noncommuting operators and uncertainty</b> (the noncommuting of lenses).<br /><br />The key feature of quantum mechanics -- the fundamentally probabilistic nature -- comes from the following two facts, confirmed by experiments (the famous experiments here are the <b>double-slit experiment</b> and <b>photoelectric effect</b> respectively):<br /><br /><ul><li>Everything is a <b>wave</b> -- objects behave as waves, following the superposition principle and the waves represent densities of observations at large scales.</li><li>Everything is a <b>particle </b>-- which manifests itself in the form of some stuff, like energy and momentum, coming in little quanta.</li></ul><div><br /></div><div>This is the principle of <b>wave-particle duality</b>. You may realise how this implies a probabilistic description, but the following example should make it quite clear: consider a wave of light, with energy $hf$ (so it's a single photon) polarised at angle $\theta$ to the horizontal -- and it passes through a horizontal polarising filter. 
Well, then the wave that passes through would be a horizontally polarised wave with energy $hf\cos^2\theta$, right?<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-g0BSGI23_Tw/XO5d5XpI5SI/AAAAAAAAFks/y51gFRd5w1MYIRBI54_95upCVVU-hiKQACLcBGAs/s1600/4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="545" data-original-width="615" height="283" src="https://1.bp.blogspot.com/-g0BSGI23_Tw/XO5d5XpI5SI/AAAAAAAAFks/y51gFRd5w1MYIRBI54_95upCVVU-hiKQACLcBGAs/s320/4.jpg" width="320" /></a></div><br />But this is <i>impossible</i>, since energy levels in quantum mechanics are quantised -- you can't <i>have</i> $\cos^2\theta$ of a photon, you can only have integer multiples of a photon. But the fact that energy drops as $E\cos^2\theta$ is something that you can verify at your home, using sunglasses -- what the heck?<br /><br />The key point is that the empirical verification of the $\cos^2\theta$ business that you can do at home is on a <b><i>macroscopic</i> level</b>, when you have a large number of photons $E=Nhf$. So something occurs with the photons on a <b>microscopic level</b> such that when you try it with a <b>large number of photons</b>, $\cos^2\theta$ of the photons pass through.<br /><br />Well, this is <b>precisely the (frequentist) definition of probability</b>! A single photon passes through the filter with a <i>probability</i> of $\cos^2\theta$ so that for a large number of photons, $\cos^2\theta$ of the photons pass through. 
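This frequentist picture is easy to simulate (a hypothetical numerical illustration, not from the original post): give each photon an independent $\cos^2\theta$ chance of passing, and the transmitted fraction -- hence the transmitted energy fraction -- converges to the macroscopic $\cos^2\theta$ value you can verify with sunglasses.

```python
import numpy as np

rng = np.random.default_rng(42)

theta = np.pi / 3                 # photon polarised at 60 degrees to the filter
p_pass = np.cos(theta) ** 2       # single-photon passage probability = 0.25

# Send N independent photons at the filter; each passes with that probability.
N = 1_000_000
passed = rng.random(N) < p_pass

# The transmitted fraction converges to cos^2(theta): Malus's law recovered
# from single-photon statistics.
print(passed.mean())              # ~ 0.25, up to sampling noise
```

With $N$ of order one photon, all you can say is "passed" or "didn't" -- which is the microscopic probabilistic picture the article is describing.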
This is a non-trivial result -- wave-particle duality makes no mention of probability as such, it just tells us that stuff is both a particle and a wave, but this simple condition in itself implies a probabilistic, non-deterministic reality.<br /><br /><hr /><br />Similar thought experiments can illustrate the probabilistic nature of other things (the "things" in question here will soon be called "eigenstates"): <b>position</b> is easy -- consider a standing wave photon in a box (this can easily be constructed). This is spread throughout the box -- so how much of the energy is in some chunk of the box?<br /><br /><b>Momentum</b> is trickier, but shouldn't be too hard if you're familiar with Fourier transforms -- what's the analog of a "box" in momentum-space? Well, consider a concentrated pulse of light -- this can be written, via a Fourier transform, as the sum of several light waves of different momenta (i.e. frequencies), each wave with some lower energy. Taking "some chunk" of this "box" amounts to filtering some specific frequencies of the light. This can be done easily, e.g. with a colour filter -- so how much of the energy is contained in the waves with these specific momenta?<br /><br />In both cases, the key point is that you can't have a fraction of the energy of the photon at these positions/momenta, so you must have a <i>probability</i> of measuring the photon to be in a specific range of positions or a specific range of momenta -- to be in a specific region of space or in a specific region of momentum-space.<br /><br /><hr /></div><div><br />The fundamental point here can be made for any quantity $X$: if you can filter out the "part" of a collection of particles that has $X$ in a certain subset of its range, then $X$ is, on a microscopic level, <i>probabilistic</i>. 
The act of "filtering out the parts with a certain $X$", applied to a single particle, is just the act of checking if a particle is in a certain $X$-interval, and is called <b>measurement</b>. Any quantity that you can measure is called an <b>observable</b>. </div><div><br /></div><div class="twn-furtherinsight">Something like polarisation is really a form of measurement -- you're <i>finding out</i> whether or not the photon is in a certain polarisation $|\phi_{\parallel}\rangle$. You may have another observable, corresponding to a different polarisation -- even one that is orthogonal to the first polarisation -- $|\phi_{\perp}\rangle$ and still get that the photon is in $|\phi_\perp\rangle$. There is nothing wrong with this, as we just know beforehand that the photon is in $|\phi_\parallel\rangle$ <em>or</em> $|\phi_\perp\rangle$. If you perform the polarisation with $|\phi_\perp\rangle$ <em>after</em> the polarisation with $|\phi_\parallel\rangle$, you will find that the photon doesn't pass through, as you know for sure that the photon is not in both $|\phi_\parallel\rangle$ <em>and</em> $|\phi_{\perp}\rangle$.<br /><br />Now, you may have certain psychological issues with this, as have many in history -- however, you might want to note that the aim of quantum mechanics is not to fix your psychological, psychiatric etc. problems but to explain nature. You need to accept logical positivism and learn to <em>shut up and calculate</em> to be comfortable with quantum mechanics -- I recommend reading <a href="https://thewindingnumber.blogspot.com/2017/08/three-domains-of-knowledge.html">Three Domains of Knowledge</a>.</div><div><br /></div><div>So whatever calculus we invent to describe these probabilistic phenomena, it is going to apply to all <i>observables</i>.</div><div><br /></div><div>In our first example, the polarisation of the photon can be represented by a unit vector which we will denote as $|\psi\rangle$. 
The polarising filter has two special axes, represented by unit vectors $|\phi_{\parallel}\rangle$ and $|\phi_\perp\rangle$ -- these are special in the sense that an incoming photon polarised as $|\phi_{\parallel}\rangle$ or $|\phi_\perp\rangle$ will simply be scaled, by factors of 1 and 0 respectively -- so these form an <b>eigenbasis</b> for a certain operator.</div><div><br /></div><div>Well, we said that the photon passes through (with polarisation $|\phi_{\parallel}\rangle$) with probability $\cos^2\theta$ -- this arises simply from considering the <i>amplitude of $|\psi\rangle$ in the direction of $|\phi_\parallel\rangle$</i>. So we can write the <b>probability that the photon ends up in a state</b> $|\phi\rangle$ as $|\langle\psi|\phi\rangle|^2$ where $\langle\psi|\phi\rangle$ is called the corresponding "<b>probability amplitude</b>".<br /><br /><b>This expression, $P(x=\lambda)=|\langle\psi|\phi_\lambda\rangle|^2$, is called Born's rule.</b></div><div><br /></div><div>Let's get back to the eigenbasis -- what exactly is this an eigenbasis of? We said that the corresponding eigenvalues are 1 and 0, so this gives us a complete description of the operator. Note that this operator depends <i>only</i> on the observable (namely "number of photons in the $|\phi_{\parallel}\rangle$ direction"), not on the state or any other feature of the observation. So we decide to <b><i>call</i> this operator/matrix the "observable"</b>, and its eigenvalues are the values of the observable that can be measured.<br /><br />To find properties of these observables, the natural way is to note that the only feature we've really required of them is Born's rule, i.e. the probabilistic interpretation -- so we can apply the axioms of probability and see what they imply in the context of these observables.<br /><br /><ul><li><b>$P(E)\ge 0$ --</b> this implies that the underlying vector space is over either the reals or the complexes, so that $|\langle\psi|\phi\rangle|^2\in \mathbb{R}$ in the first place. 
The nonnegativity then follows.</li><li><b>$P(\Omega)=1$ and </b><b>$P\left(\bigcup_i E_i\right) = \sum_i P(E_i)$ for disjoint $E_i$ -- </b>together, these imply that $\sum |\langle\phi|\psi\rangle|^2 = 1=|\langle\psi|\psi\rangle|^2$ where the sum is taken over all eigenstates $|\phi\rangle$ of the operator. As this must be true for all states $|\psi\rangle$, the thing on the left must be a Pythagorean sum, so the $|\phi\rangle$s must form an orthogonal basis. This implies that all <b>observables are normal operators</b>.</li></ul><div><br /></div><div>The latter fact is very important, and can also be seen in the following way -- if a system is in one eigenstate, it cannot possibly collapse onto another eigenstate (the probabilistic interpretation is: if you know for sure the value of the observable, it's that value) -- so we must have $|\langle \phi_1|\phi_2\rangle|^2=0$ for all eigenstates $|\phi_1\rangle$ and $|\phi_2\rangle$ with distinct eigenvalues.</div><div><br /></div><div>Another restriction we add is that the observables be not only normal, but <b>Hermitian operators</b> in particular, so they have real eigenvalues. This may seem an odd choice, but it makes sense, as any normal operator may be uniquely written as $X_H+iX_{AH}$ where $X_H$ and $X_{AH}$ are Hermitian, and $X_H$ and $X_{AH}$ commute, so any complex observation can be done unambiguously as two real observations. So we stick to real eigenvalues.</div><div><br /></div><div>This also makes it essential that we allow complex operators rather than just real ones (the two choices were given to us from the first probability axiom), so that this decomposition is possible. Later, we will see concrete examples of this with commutators $[X,Y]$, which must be multiplied by $i$ to turn Hermitian. 
We will also see more fundamental reasons to choose complex numbers in QM.</div><div><br /></div><div><div><hr /></div><div><br /><b>Exercise:</b> Show that the expected value of an observable $X$ given a state $\psi$ can be given as $\langle \psi|X|\psi \rangle$ (i.e. $\psi^*X\psi$ in conventional notation).<br /><br /><b>Exercise:</b> Explain Born's rule with other observables, like position and momentum. Explain why it holds in general.</div></div></div>born's ruleeigenvalueseigenvectorshermitian matrixhilbert spacelinear algebrameasurementnormal matrixobservablesphysicspolarisationprobabilityquantum mechanicswave-particle dualityWed, 29 May 2019 22:11:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-241005614591950294Abhimanyu Pallavi Sudhir2019-05-29T22:11:00ZAnswer by Abhimanyu Pallavi Sudhir for Why do we use Hermitian operators in QM?
https://physics.stackexchange.com/questions/39602/why-do-we-use-hermitian-operators-in-qm/482816#482816
0<p>The point of eigenstates and the entire linear algebra of quantum mechanics is that the projections <span class="math-container">$\langle\phi|\psi\rangle$</span> of the state <span class="math-container">$|\psi\rangle$</span> onto each eigenstate <span class="math-container">$|\phi\rangle$</span> represent the probability amplitudes of each eigenstate. In particular, this means:</p>
<p><span class="math-container">$$\sum |\langle\phi|\psi\rangle|^2 = 1=|\langle\psi|\psi\rangle|^2$$</span></p>
<p>Where the summation is taken over all the eigenstates of an operator. As this must be true for all states <span class="math-container">$|\psi\rangle$</span>, the thing on the left must be a Pythagorean sum, so the <span class="math-container">$|\phi\rangle$</span>s must form an orthogonal basis. Alternatively, one may just note that we must have <span class="math-container">$\langle \phi_1|\phi_2\rangle=0$</span> if the corresponding eigenvalues are distinct, as two distinct observations must be mutually exclusive.</p>
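As a concrete check of the orthogonality claim (a numerical sketch, not part of the original answer): diagonalise a random Hermitian matrix with a *generic* eigensolver, which assumes nothing about Hermiticity, and verify that the eigenvalues come out real and the eigenvectors mutually orthogonal anyway.

```python
import numpy as np

rng = np.random.default_rng(7)

# A random Hermitian matrix: H = (A + A*) / 2.
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
H = (A + A.conj().T) / 2

w, V = np.linalg.eig(H)            # generic solver, no Hermitian assumption used
G = V.conj().T @ V                  # Gram matrix of the eigenvectors

# Real eigenvalues; orthogonal eigenvectors (Gram matrix is diagonal).
print(np.allclose(w.imag, 0, atol=1e-8),
      np.allclose(G, np.diag(np.diag(G)), atol=1e-8))   # True True
```

(A random matrix has distinct eigenvalues almost surely; within a degenerate eigenspace one would have to orthogonalise by hand, which is the usual caveat.)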
<hr>
<p>That shows that the matrices must be normal. That they are chosen to be Hermitian is non-essential, but useful, as has already been discussed.</p>Mon, 27 May 2019 22:20:33 GMThttps://physics.stackexchange.com/questions/39602/-/482816#482816Abhimanyu Pallavi Sudhir2019-05-27T22:20:33ZComment by Abhimanyu Pallavi Sudhir on What are your favorite instructional counterexamples?
https://mathoverflow.net/questions/16829/what-are-your-favorite-instructional-counterexamples/17285#17285
@ManfredWeis Would you recall the title of the post you meant to link to? Your link is an actively updated feed -- is it <a href="https://calculus7.org/2014/12/07/tossing-a-continuous-coin/" rel="nofollow noreferrer">this</a>?Sat, 25 May 2019 13:30:55 GMThttps://mathoverflow.net/questions/16829/what-are-your-favorite-instructional-counterexamples/17285?cid=829577#17285Abhimanyu Pallavi Sudhir2019-05-25T13:30:55ZAnswer by Abhimanyu Pallavi Sudhir for Are there other kinds of bump functions than $e^\frac1{x^2-1}$?
https://math.stackexchange.com/questions/101480/are-there-other-kinds-of-bump-functions-than-e-frac1x2-1/3236066#3236066
2<p>Here's how you can generate as many different kinds of bump functions as you want, for whatever definition of "kind" you may have:</p>
<ol>
<li>Start with any function <span class="math-container">$f(x)$</span> that <strong>grows faster than all polynomials</strong>, i.e. <span class="math-container">$\forall N, \ \lim_{x\to\infty}\frac{x^N}{f(x)}=0$</span>. Example: <span class="math-container">$e^x$</span>.</li>
<li>Then consider the function <span class="math-container">$g(x)=\frac1{f(1/x)}$</span>. This is a function that is flatter than all polynomials near zero, i.e. <span class="math-container">$\forall N,\ \lim_{x\to0}\frac{g(x)}{x^N}=0$</span>. This is a <strong>smooth non-analytic</strong> function. For our example, we get <span class="math-container">$e^{-1/x}$</span>.</li>
<li>Consider the function <span class="math-container">$h(x)=g(1+x)g(1-x)$</span>. This, after zeroing out stuff outside the interval <span class="math-container">$(-1,1)$</span>, is a <strong>bump function</strong>. For our example, <span class="math-container">$e^{2/(x^2-1)}$</span>.</li>
<li>Scale and transform to your liking.</li>
</ol>
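<p>The steps above can be sketched in a few lines of Python (a rough numerical sketch -- the explicit zeroing of $x\le 0$ plays the role of the "zeroing out stuff" in step 3):</p>

```python
import numpy as np

def bump_from(f):
    """Build a bump function on (-1, 1) from a fast-growing f."""
    def g(x):
        # Step 2: g(x) = 1/f(1/x) for x > 0, extended by 0 elsewhere
        x = np.asarray(x, dtype=float)
        out = np.zeros_like(x)
        pos = x > 0
        out[pos] = 1.0 / f(1.0 / x[pos])
        return out
    # Step 3: h(x) = g(1+x) g(1-x), automatically 0 outside (-1, 1)
    return lambda x: g(1.0 + np.asarray(x)) * g(1.0 - np.asarray(x))

h = bump_from(np.exp)  # should recover e^{2/(x^2 - 1)} on (-1, 1)
x = np.array([0.0, 0.5, 1.0, 2.0])
assert np.isclose(h(x)[0], np.exp(-2.0))            # e^{2/(0 - 1)}
assert np.isclose(h(x)[1], np.exp(2.0 / (0.25 - 1)))
assert h(x)[2] == 0.0 and h(x)[3] == 0.0            # vanishes outside (-1, 1)
```

Swapping <code>np.exp</code> for another fast-growing <code>f</code> reproduces step 4's "different kinds" directly.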
<p>Just do this with different "kinds" of growth functions <span class="math-container">$f$</span>, and you'll get different "kinds" of bump functions <span class="math-container">$h$</span>. So here are some functions I could generate with this method -- try to guess which functions they're from:</p>
<p><span class="math-container">$$\begin{array}{l}
h(x) = {e^{2/({x^2} - 1)}} \\
h(x) = (1 + x)^{1/(1 + x)}(1 - x)^{1/(1 - x)} \\
h(x) = \frac1{\frac1{1 + x}!\frac1{1-x}!} \\
h(x)=e^{-[\ln^2(1+x)+\ln^2(1-x)]}
\end{array}$$</span></p>
<p>And the more rapidly your <span class="math-container">$f(x)$</span> grows, the nicer your bump function <span class="math-container">$h(x)$</span> looks.</p>
<hr>
<p>Here's a Desmos applet to try this with different functions <span class="math-container">$f$</span>: <a href="https://www.desmos.com/calculator/ccf2goi9bj" rel="nofollow noreferrer"><strong>desmos.com/calculator/ccf2goi9bj</strong></a>. </p>
<p>If you're interested in smooth non-analytic functions, have a look at my post <a href="https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html" rel="nofollow noreferrer"><strong>What's with e^(-1/x)? On smooth non-analytic functions: part I</strong></a>.</p>Wed, 22 May 2019 18:36:36 GMThttps://math.stackexchange.com/questions/101480/are-there-other-kinds-of-bump-functions-than-e-frac1x2-1/3236066#3236066Abhimanyu Pallavi Sudhir2019-05-22T18:36:36ZAnswer by Abhimanyu Pallavi Sudhir for Why does Taylor’s series “work”?
https://physics.stackexchange.com/questions/480163/why-does-taylor-s-series-work/481556#481556
0<p>Adding to <a href="https://physics.stackexchange.com/a/480187/">Sympathiser's answer</a> -- one can see why the existence of functions like <span class="math-container">$e^{-1/x}$</span> is not surprising by rephrasing them as "<strong>functions that approach zero near zero faster than any polynomial</strong>". This is not fundamentally more surprising than e.g. functions that grow faster than every polynomial -- in fact, for any function <span class="math-container">$f(x)$</span> that grows faster than every polynomial, the function <span class="math-container">$\frac1{f(1/x)}$</span> approaches zero near zero faster than any polynomial.</p>
<p>So for rapidly growing <span class="math-container">$f(x)=e^x$</span>, one gets the corresponding smooth non-analytic <span class="math-container">$e^{-1/x}$</span>. For <span class="math-container">$x^x$</span>, one gets <span class="math-container">$x^{1/x}$</span>. For <span class="math-container">$x!$</span>, one gets <span class="math-container">$\frac{1}{(1/x)!}$</span>, and so on.</p>
<p>See my post <a href="https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html" rel="nofollow noreferrer"><strong>What's with e^(-1/x)? On smooth non-analytic functions: part I</strong></a> for a fuller explanation.</p>Wed, 22 May 2019 00:11:33 GMThttps://physics.stackexchange.com/questions/480163/-/481556#481556Abhimanyu Pallavi Sudhir2019-05-22T00:11:33ZWhat's with e^(-1/x)? On smooth non-analytic functions: part I
https://thewindingnumber.blogspot.com/2019/05/whats-with-e-1x-on-smooth-non-analytic.html
0When you first learned about the Taylor series, your intuition probably went something like this: you have $f(x)$, the derivative at this point tells you how $f$ changes from $x$ to $x+dx$ (which tells you $f(x+dx)$), the second derivative tells you how $f'$ changes from $x$ to $x+dx$, which recursively tells you $f(x+2\ dx)$, the third derivative tells you $f(x+3\ dx)$, and so on -- so if you have an <i>infinite </i>number of derivatives, you know how <i>each</i> derivative changes, so you should be able to predict the <i>full global behaviour of the function</i>, assuming it is infinitely differentiable (smooth) throughout.<br /><br />Everything is nice and dandy in this picture. But then you come across two disastrous, life-changing facts that make you cry for those good old days:<br /><ol><li><b>Taylor series have <i>radii of convergence</i> -- </b>If I can predict the behaviour of a function up until a certain point, why can't I predict it a bit afterwards? It makes sense if the function becomes rough at that point, like if it jumps to infinity, but even functions like $1/(1+x^2)$ have this problem. Sure, we've heard the explanation involving complex numbers, but why should we care about the complex singularities (here's a question: do we care about quaternion singularities?)? Specifically, a Taylor series may have a zero radius of convergence. Points around which a Taylor series has a zero radius of convergence are called <b>Pringsheim points</b>.</li><li><b>Weird crap --</b> Like $e^{-1/x}$. Here, the Taylor series <i>does</i> converge, but it converges to the wrong thing -- in this case, to zero. 
Points at which the Taylor series doesn't equal a function on any neighbourhood, despite converging, are called <b>Cauchy points</b>.</li></ol><div>In this article, we'll address the <b>weird crap -- </b>$e^{-1/x}$ (or "$e^{-1/x}$ for $x>0$, 0 for $x= 0$" if you want to be annoyingly formal about it) will be the example we'll use throughout, so if you haven't already seen this, go plot it on Desmos and get a feel for how it looks near the origin.<br /><br /><i>Terminology:</i> We'll refer to <b>smooth non-analytic functions</b> as <b>defective functions</b>.<br /><br /></div><hr /><br /><div>The thing to realise about $e^{-1/x}$ is that the Taylor series -- $0 + 0x + 0x^2 + ...$ -- <i>isn't wrong</i>. The truncated Taylor series of degree $n$ is the <i>best polynomial approximation </i>for the function near zero, and none of the logic here fails for $e^{-1/x}$. There is honestly no other polynomial that better approximates the shape of the function as $x\to 0$.<br /><div><br /></div><div>If you think about it this way, it isn't too surprising that such a function exists -- what we have is a function that <b>goes to zero</b> as $x\to 0$ <b>faster than any polynomial</b> does. I.e. a function $g(x)$ such that</div><div>$$\forall n, \lim\limits_{x\to0}\frac{g(x)}{x^n}=0$$</div><div>This is not fundamentally any weirder than a function that escapes to infinity faster than all polynomials. In fact, such functions are quite directly connected. Given a function $f(x)$ satisfying:</div><div>$$\forall n, \lim\limits_{x\to\infty} \frac{x^n}{f(x)} = 0$$</div><div>We can make the substitution $x\leftrightarrow 1/x$ to get</div><div>$$\forall n, \lim\limits_{x\to0} \frac{1}{x^n f(1/x)} = 0$$</div><div>So $\frac1{f(1/x)}$ is a valid $g(x)$. 
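The correspondence is easy to test numerically. In this Python sketch, each $g(x)=1/f(1/x)$ is checked to shrink against $x^n$ as $x\to0^+$; the sample points are kept moderate so that $f(1/x)$ does not overflow a float:

```python
import math

# fast-growing f  ->  g(x) = 1/f(1/x), flat at 0 from the right
examples = {
    "e^x": math.exp,
    "x^x": lambda t: t ** t,
    "x!":  lambda t: math.gamma(t + 1),
}

for name, f in examples.items():
    def g(x, f=f):
        return 1.0 / f(1.0 / x)
    for n in (2, 5):
        # g(x)/x^n should decrease towards 0 as x -> 0+
        ratios = [g(x) / x ** n for x in (0.1, 0.05, 0.02)]
        assert ratios[0] > ratios[1] > ratios[2] > 0, (name, n)
```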
Indeed, we can generate plenty of the standard smooth non-analytic functions this way: $f(x)=e^x$ gives $g(x)=e^{-1/x}$, $f(x)=x^x$ gives $g(x)=x^{1/x}$, $f(x)=x!$ gives $g(x)=\frac1{(1/x)!}$ etc.<br /><br /></div><div><hr /><br /><div></div></div><div>To better study what exactly is going on here, consider Taylor expanding $e^{-1/x}$ around some point other than 0, or equivalently, expanding $e^{-1/(x+\varepsilon)}$ around 0. One can see that:</div></div><div>$$\begin{array}{*{20}{c}}{f(0) = {e^{ - 1/\varepsilon }}}\\{f'(0) = \frac{1}{{{\varepsilon ^2}}}{e^{ - 1/\varepsilon }}}\\{f''(0) = \frac{{ - 2\varepsilon + 1}}{{{\varepsilon ^4}}}{e^{ - 1/\varepsilon }}}\\{f'''(0) = \frac{{6{\varepsilon ^2} - 6\varepsilon + 1}}{{{\varepsilon ^6}}}{e^{ - 1/\varepsilon }}}\\ \vdots \end{array}$$</div><div>Or ignoring higher-order terms for our purposes,</div><div>$$f^{(N)}(0)\approx(1/\varepsilon)^{2N}e^{-1/\varepsilon}$$</div><div>Each derivative $\frac{e^{-1/\varepsilon}}{\varepsilon^{2N}}\to0$ as $\varepsilon\to0$, but they each approach zero <i>slower</i> than the previous derivative, and somehow that is enough to give the sequence of derivatives the "kick" that they need in the domino effect that follows -- from somewhere at $N=\infty$ (putting it non-rigorously) -- to make the function grow as $x$ leaves zero, even though all the derivatives were zero at $x=0$.</div><div><br /></div><div><hr /><br /></div><div><i>But</i> we can still make it work -- by letting $N$, the upper limit of the summation approach $\infty$ <i>first</i>, before $\varepsilon\to 0$. 
In other words, instead of directly computing the derivatives $f^{(n)}(0)$, we consider the terms</div><div>$$\begin{array}{*{20}{c}}{f_\varepsilon^{(0)}(0) = f(0)}\\{{{f}_\varepsilon^{(1)} }(0) = \frac{{f(\varepsilon ) - f(0)}}{\varepsilon }}\\{{{f}_\varepsilon^{(2)} }(0) = \frac{{f(2\varepsilon ) - 2f(\varepsilon ) + f(0)}}{{{\varepsilon ^2}}}}\\{{{f}_\varepsilon^{(3)} }(0) = \frac{{f(3\varepsilon ) - 3f(2\varepsilon ) + 3f(\varepsilon ) - f(0)}}{{{\varepsilon ^3}}}}\\ \vdots \end{array}$$</div><div>And write the generalised <b>Hille-Taylor series</b> as:</div><div>$$f(x) = \mathop {\lim }\limits_{\varepsilon \to 0} \sum\limits_{n = 0}^\infty {\frac{{{x^n}}}{{n!}}f_\varepsilon ^{(n)}(0)} $$</div><div>Then $N\to\infty$ before $\varepsilon\to0$ so you "reach" $N\to\infty$ first (or rather, you get large $n$th derivatives for increasing $n$) before $\varepsilon$ gets to 0.</div><div><br /></div><div>Another way of thinking about it is that the "local determines global" stuff makes sense to predict the value of the function at $N\varepsilon$, countable $N$, but it's a stretch to talk about uncountably many $\varepsilon$s away, which is what a finite neighbourhood is. But with these difference operators in the Hille-Taylor series, one ensures that each neighbourhood is a finite multiple of $\varepsilon$ away at any point, so the differences determine $f$.<br /><br /><hr /></div><div><b>Very simple (but fun to plot on Desmos) exercise: </b>use $e^{-1/x}$ or another defective function to construct a "<b>bump function</b>", i.e. a smooth function that is 0 outside $(0, 1)$, but takes non-zero values everywhere in that range.<br /><br />Similarly, construct a "<b>transition function</b>", i.e. a smooth function that is 0 for $x\le0$, 1 for $x\ge1$.
(hint: think of a transition as going from a state with "none of the fraction" to "all of the fraction")<br /><br />If you're done, play around with this (but no peeking): <a href="https://www.desmos.com/calculator/ccf2goi9bj"><b>desmos.com/calculator/ccf2goi9bj</b></a></div>analysisanalytic functionscalculusconvergencemathematicssmooth functionstaylor seriesTue, 21 May 2019 23:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-8831094740684334216Abhimanyu Pallavi Sudhir2019-05-21T23:57:00ZData formats of inputs to arrange function in dplyr
https://stackoverflow.com/questions/56158114/data-formats-of-inputs-to-arrange-function-in-dplyr
0<p>Given a table <code>monkeys</code> with column <code>brain_size</code>, one can write something like <strong><code>arrange(monkeys, brain_size)</code></strong>. </p>
<p>I don't understand how this makes sense -- <strong><code>brain_size</code> isn't a declared variable</strong> (if I refer to it, I get an error). It's just the name of a column -- shouldn't you rather have <code>arrange(monkeys, 'brain_size')</code>? <strong><em>Isn't</em> the column name just a string?</strong></p>
<p>Another related weirdness -- </p>
<pre><code>arrange(monkeys, desc(brain_size))
</code></pre>
<p>Once again, what exactly is the <strong><code>desc</code> function</strong>? How can it take <code>brain_size</code> as an input? Shouldn't you have something like <code>arrange(monkeys, 'brain_size', desc = true)</code>?</p>
<p>Am I missing something? Perhaps <code>brain_size</code> is a variable in some way but can only be accessed when you're unambiguously "inside" <code>monkeys</code>.</p>rfunctiontypesdplyrWed, 15 May 2019 21:51:42 GMThttps://stackoverflow.com/q/56158114Abhimanyu Pallavi Sudhir2019-05-15T21:51:42ZThe Cauchy Riemann Equations: what do they really mean?
https://thewindingnumber.blogspot.com/2019/05/what-do-cauchy-riemann-equations-really.html
0<b>Question: <a href="https://math.stackexchange.com/a/3197879/78451">Geometrical Interpretation of Cauchy Riemann equations?</a></b><br /><br />One might think that being differentiable on $\mathbb{R}^2$ is sufficient for differentiability on $\mathbb{C}$. But the Jacobian of an arbitrary such function doesn't have a natural complex number representation.<br /><br />$$<br />\left[ {\begin{array}{*{20}{c}}<br />{\partial u/\partial x} & {\partial u/\partial y} \\<br />{\partial v/\partial x} & {\partial v/\partial y}<br />\end{array}} \right]<br />$$<br />Another way of putting this is that no complex-valued derivative (see below for an example) you can define for an arbitrary function fully captures the local behaviour of the function that is represented by the Jacobian.<br /><br />$$<br />\frac{df}{dz} = \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + i\left(\frac{\partial v}{\partial x}-\frac{\partial u}{\partial y}\right)<br />$$<br />The idea is that we should be able to define a complex-valued derivative "purely" for the value $z$, without considering directions, i.e. we want to consider $\mathbb{C}$ one-dimensional in some sense (the sense being "as a vector space"). More precisely, the derivative in some direction in $\mathbb{C}$ should determine the derivative in all other directions in a natural manner -- whereas on $\mathbb{R}^2$, the derivatives in *two* directions (i.e. the gradient) determine the directional derivatives in all directions. <br /><br />If you think about it, this is quite a reasonable idea -- it's analogous to how not every linear transformation on $\mathbb{R}^2$ is a linear transformation on $\mathbb{C}$ -- only spiral transformations are.<br /><br />$$<br />\left[ {\begin{array}{*{20}{c}}<br />{a} & {-b} \\<br />{b} & {a}<br />\end{array}} \right]<br />$$<br />How would we generalise differentiability to an arbitrary manifold?
Here's an idea: <b>a function is differentiable if it is locally a linear transformation</b>. So on $\mathbb{R}^2$, any Jacobian matrix is a linear transformation. But on $\mathbb{C}$, only Jacobians of the above form are linear transformations -- i.e. the only linear transformation on $\mathbb{C}$ is <b>multiplication by a complex number</b>, i.e. a spiral/amplitwist. So a complex differentiable function is one that is locally an amplitwist (geometrically), which can be stated in terms of the components of the Jacobian as:<br /><br />$$<br />\begin{align}<br />\frac{\partial u}{\partial x} & = \frac{\partial v}{\partial y} \\<br />\frac{\partial u}{\partial y} & = - \frac{\partial v}{\partial x} \\<br />\end{align}<br />$$<br />This is precisely why you shouldn't (and can't) view complex differentiability as some basic first-degree smoothness -- there is a much richer structure to these functions, and it's better to think of them via the transformations they have on grids.calculuscauchy-riemanncomplex analysisjacobianlinear algebraSun, 12 May 2019 23:35:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-3734514288105692758Abhimanyu Pallavi Sudhir2019-05-12T23:35:00ZLie Bracket, closure under the Lie Bracket
https://thewindingnumber.blogspot.com/2019/05/an-easy-way-to-see-closure-under-lie.html
0(If you're just here for the easy way to see closure, skip ahead to <a href="#closure">Closure under the Lie Bracket</a>)<br /><br />In the <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html">previous article</a>, I introduced Lie Groups and Lie Algebras by talking about Lie Algebras as a parameterisation for the Lie Group -- we said that the elements of the Lie Group could be written as exponentials of these parameters (not uniquely, sure, but they can be written in this way). Some things to note here:<br /><ul><li>What we've called "Lie Groups" refers only to <b><i>connected</i> Lie Groups</b>, as motivation. In general, the theory of Lie groups considers <b>any group that is also a manifold </b>-- for instance, the non-zero real numbers are also a Lie Group (even though their Lie Algebra is identical to that of the positive real numbers -- can you see why?). We will hereby use this more general definition.</li><li>It's not really true that any Lie group can be parameterised in this fashion by writing each element as an exponential of a Lie Algebra element -- even for connected groups. This shouldn't be surprising -- given a term of the form $\exp X$ and a term $\exp Y$, their product $\exp X\exp Y$ is in the group by closure, but it isn't necessarily equivalent to $\exp(X+Y)$ on a non-Abelian group (could it be the exponential of something else? We'll find out later).</li><li>A <i>parameterisation</i> of this form is not the same as a <i>co-ordinate system</i>.</li></ul>The last point is what we will concentrate on in this article, because not being described fully by the Lie algebra is what makes things interesting, right?<br /><br />What is a co-ordinate system on a manifold? Well, the key point is that any element of the manifold can be decomposed in terms of its components along the co-ordinates.
On a Lie Group, this means that there should exist a "basis" for the Lie Group $\exp(X_1),\ldots\exp(X_n)$ corresponding to the basis $X_1,\ldots X_n$ for the Lie Algebra vector space such that every element of the Lie Group can be written as products of powers of these elements, and any rearrangement of the terms in the product should leave it invariant (i.e. the elements should commute with each other).<br /><br /><div class="twn-pitfall">Note that it <em>is</em> possible to decompose elements of a connected Lie Group as a product of <em>some</em> exponentials, but this is different from there being specifically $n$ elements that one can write any Lie group element as products of.</div><br />But clearly, this can only be possible if the group is <i>Abelian</i>, commutative. This is a special case of the more general fact that only a <b>holonomic basis</b> gives rise to a co-ordinate system on a manifold. The idea is -- a closed loop should produce no overall group action. If you <b>flow</b> $\varepsilon$ in the $X$ direction, then flow $\varepsilon$ in the $Y$ direction, then flow $\varepsilon$ back in the $X$ direction and flow $\varepsilon$ back in the $Y$ direction, you should end up back where you started. If you don't, then the resulting difference is the infinitesimal "<b>group commutator</b>" of the Lie Group:<br /><br />$$e^{\varepsilon X}e^{\varepsilon Y}e^{-\varepsilon X}e^{-\varepsilon Y}$$
If the commutator was to first-order $1+\varepsilon z$, $\exp z$ would be equal to 1, and so it would give no characterisation at all of the amount of non-commutativity of the flows $X$ and $Y$) -- it's analogous to vector calculus, where the <b>curl</b> of a vector field is proportional to $\varepsilon^2$ (i.e. a line integral along the curve is proportional to its area, so you divide it by this area in the definition of curl, etc.).<br /><br />The second-order term, $XY-YX$, is more interesting. This may seem weird because so far, we've been considering the Lie algebra purely as a <b>vector space</b>, with addition and scalar multiplication being the only things going on. But clearly, this cannot be the entire picture, or a connected Lie group would be characterised entirely by the dimension of its Lie algebra. This operation -- the <b>Lie Bracket</b> or <b>Lie Algebra commutator</b> represented by $[X,Y]$ -- as we will see, gives some additional structure to the Lie Algebra, and in fact characterises it (we'll see what this means).<br /><br />So far, we've obtained no motivation for why this operation $XY-YX$ is actually of any significance. Sure, it appeared in our second-order approximation for the group commutator, but is the group commutator we defined really so great? Surely there could be other ways one could measure the non-commutativity of a group. And the $\varepsilon^2$ business is <i>weird</i>. Things that arise proportional to $\varepsilon$ live in the tangent space, in the Lie Algebra. 
Where does $[X,Y]$ even live?<br /><br />Two facts will convince us that the Lie Bracket is indeed the "right" measure of non-commutativity of a Lie Algebra:<br /><br /><ul><li><b>The Lie Algebra is closed under the Lie Bracket -- </b>we will see that in fact, $[X,Y]$ lives <i>in the Lie Algebra</i>, so it is in fact a binary operation on the Lie Algebra, and really does add structure to the Lie Algebra.</li><li><b>It characterises the entire Lie Algebra -- </b>not only is it <i>part</i> of the structure of the Lie Algebra, it characterises the entire structure of the Lie Algebra. What this means is that defining the Lie Bracket on the vector space allows a full characterisation of the part of the group connected to the identity (the "connected part" of the group), so we can say that any Lie Algebras with the same dimension and Lie Bracket are isomorphic.</li></ul><div><br /></div><hr /><br /><a href="https://www.blogger.com/null" id="closure" name="closure"><b>Closure under the Lie Bracket</b></a><br /><br />If you're like me, you might've thought of several analogous situations to our $1+\varepsilon^2(XY-YX)$ expression -- e.g. in (complex) analysis, at a point where the derivative of a function is zero, the function is characterised by its <i>second</i> derivative (consult Needham's <i>Visual Complex Analysis</i>, p. 205-207 for an explanation). Another example is -- if the first derivative of a function is zero, the second derivative satisfies the product rule (this is actually directly related, in a way we won't go into now).<br /><br />Here's an idea you <i>might</i> think of: as we discussed earlier, the infinitesimal group commutator is $e^{\varepsilon X}e^{\varepsilon Y}e^{-\varepsilon X}e^{-\varepsilon Y}= 1+\varepsilon^2 (XY - YX) + O(\varepsilon^3)\in G$. But for a moment let $\varepsilon$ not be infinitesimal.
So $\varepsilon (XY - YX) + O(\varepsilon^2)\in \mathfrak{g}$, the Lie Algebra corresponding to Lie Group $G$, so by scaling $XY-YX+O(\varepsilon)\in\mathfrak{g}$ and by connectedness of the vector space $XY-YX\in\mathfrak{g}$.<br /><br />But this argument is <b>incorrect</b> -- this becomes obvious if you try to formally write it down -- In general, $1+\varepsilon T\in G$ does <b>not</b> imply $T\in\mathfrak{g}$ for non-infinitesimal $\varepsilon$. It's close to an element in $\mathfrak{g}$ (for small $\varepsilon$), but how close? You might get the feeling that it is "sufficiently close", in that the limit $\varepsilon\to0$ of the sequence $\left(c_\varepsilon(X,Y)-1\right)/\varepsilon^2$ (where $c_\varepsilon(X,Y)$ is the group commutator) indeed ends up in the Lie Algebra.<br /><br />To make this feeling formal, consider instead the curve parameterised differently as $\gamma(\varepsilon)=e^{\sqrt\varepsilon X}e^{\sqrt\varepsilon Y}e^{-\sqrt\varepsilon X}e^{-\sqrt\varepsilon Y}$. Then $\gamma'(0)=XY-YX$, and we're done.<br /><br /><div class="twn-furtherinsight">think about the Taylor expansion here of this new curve for a while</div>holonomic co-ordinateslie algebralie bracketlie groupslie theorymathematicsMon, 06 May 2019 21:21:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-2049429296050963446Abhimanyu Pallavi Sudhir2019-05-06T21:21:00ZAnswer by Abhimanyu Pallavi Sudhir for Geometrical Interpretation of Cauchy Riemann equations?
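The expansion $c_\varepsilon(X,Y)=1+\varepsilon^2(XY-YX)+O(\varepsilon^3)$ can be checked numerically for matrix groups. A small Python sketch (the truncated-series <code>expm</code> and the random $3\times3$ matrices are just for illustration):

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential by truncated power series (fine for small ||A||)."""
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

rng = np.random.default_rng(1)
X, Y = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
bracket = X @ Y - Y @ X

def residual(eps):
    # group commutator e^{eX} e^{eY} e^{-eX} e^{-eY}
    c = expm(eps * X) @ expm(eps * Y) @ expm(-eps * X) @ expm(-eps * Y)
    return np.linalg.norm(c - np.eye(3) - eps**2 * bracket)

# what is left after subtracting I + eps^2 [X,Y] is third order in eps:
r1, r2 = residual(1e-2), residual(5e-3)
assert r1 < 1e3 * (1e-2) ** 3
assert 6 < r1 / r2 < 10  # halving eps shrinks an O(eps^3) term by ~8
```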
https://math.stackexchange.com/questions/1026134/geometrical-interpretation-of-cauchy-riemann-equations/3197879#3197879
1<p>One might think that being differentiable on <span class="math-container">$\mathbb{R}^2$</span> is sufficient for differentiability on <span class="math-container">$\mathbb{C}$</span>. But the Jacobian of an arbitrary such function doesn't have a natural complex number representation.</p>
<p><span class="math-container">$$
\left[ {\begin{array}{*{20}{c}}
{\partial u/\partial x} & {\partial u/\partial y} \\
{\partial v/\partial x} & {\partial v/\partial y}
\end{array}} \right]
$$</span></p>
<p>Another way of putting this is that no complex-valued derivative (see below for an example) you can define for an arbitrary function fully captures the local behaviour of the function that is represented by the Jacobian.</p>
<p><span class="math-container">$$
\frac{df}{dz} = \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right) + i\left(\frac{\partial v}{\partial x}-\frac{\partial u}{\partial y}\right)
$$</span></p>
<p>The idea is that we should be able to define a complex-valued derivative "purely" for the value <span class="math-container">$z$</span>, without considering directions, i.e. we want to consider <span class="math-container">$\mathbb{C}$</span> one-dimensional in some sense (the sense being "as a vector space"). More precisely, the derivative in some direction in <span class="math-container">$\mathbb{C}$</span> should determine the derivative in all other directions in a natural manner -- whereas on <span class="math-container">$\mathbb{R}^2$</span>, the derivatives in <em>two</em> directions (i.e. the gradient) determine the directional derivatives in all directions. </p>
<p>If you think about it, this is quite a reasonable idea -- it's analogous to how not every linear transformation on <span class="math-container">$\mathbb{R}^2$</span> is a linear transformation on <span class="math-container">$\mathbb{C}$</span> -- only spiral transformations are.</p>
<p><span class="math-container">$$
\left[ {\begin{array}{*{20}{c}}
{a} & {-b} \\
{b} & {a}
\end{array}} \right]
$$</span></p>
<p>How would we generalise differentiability to an arbitrary manifold? Here's an idea: <strong>a function is differentiable if it is locally a linear transformation</strong>. So on <span class="math-container">$\mathbb{R}^2$</span>, any Jacobian matrix is a linear transformation. But on <span class="math-container">$\mathbb{C}$</span>, only Jacobians of the above form are linear transformations -- i.e. the only linear transformation on <span class="math-container">$\mathbb{C}$</span> is <strong>multiplication by a complex number</strong>, i.e. a spiral/amplitwist. So a complex differentiable function is one that is locally an amplitwist (geometrically), which can be stated in terms of the components of the Jacobian as:</p>
<p><span class="math-container">$$
\begin{align}
\frac{\partial u}{\partial x} & = \frac{\partial v}{\partial y} \\
\frac{\partial u}{\partial y} & = - \frac{\partial v}{\partial x} \\
\end{align}
$$</span></p>
<p>This is precisely why you shouldn't (and can't) view complex differentiability as some basic first-degree smoothness -- there is a much richer structure to these functions, and it's better to think of them via the transformations they have on grids.</p>Tue, 23 Apr 2019 05:55:35 GMThttps://math.stackexchange.com/questions/1026134/-/3197879#3197879Abhimanyu Pallavi Sudhir2019-04-23T05:55:35ZTrace, Laplacian, the Heat equation, divergence theorem
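The amplitwist characterisation lends itself to a direct numerical check: the finite-difference Jacobian of an analytic map like $f(z)=z^2$ has the spiral form above, while that of $f(z)=\bar z$ does not. A sketch (the sample point $1+2i$ is arbitrary):

```python
import numpy as np

def jacobian(f, x, y, h=1e-6):
    """Central-difference Jacobian of f viewed as a map R^2 -> R^2."""
    u = lambda x, y: f(complex(x, y)).real
    v = lambda x, y: f(complex(x, y)).imag
    return np.array([
        [(u(x + h, y) - u(x - h, y)) / (2 * h), (u(x, y + h) - u(x, y - h)) / (2 * h)],
        [(v(x + h, y) - v(x - h, y)) / (2 * h), (v(x, y + h) - v(x, y - h)) / (2 * h)],
    ])

def is_amplitwist(J, tol=1e-4):
    # Cauchy-Riemann: u_x = v_y and u_y = -v_x
    return abs(J[0, 0] - J[1, 1]) < tol and abs(J[0, 1] + J[1, 0]) < tol

J1 = jacobian(lambda z: z ** 2, 1.0, 2.0)          # analytic
J2 = jacobian(lambda z: z.conjugate(), 1.0, 2.0)   # not analytic
assert is_amplitwist(J1) and not is_amplitwist(J2)
```

For $f(z)=z^2$ at $z=1+2i$ the Jacobian comes out (up to rounding) as the spiral matrix with $a=2$, $b=4$, matching $f'(z)=2z$.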
https://thewindingnumber.blogspot.com/2019/04/trace-laplacian-heat-equation.html
0The aim of this article is to help build an intuition for the trace of a matrix, "the sum of the elements on the diagonal" -- the basic idea is that the trace is an "average" of some sort, an average of the action of an operator or a quadratic form. We'll make this idea clearer with an example from classical physics: the heat equation.<br /><br /><hr /><br />Consider an $n$-dimensional space with some temperature distribution $T(\vec{x},t)$. We wish to set up a differential equation for this function.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-WDr_mgo-qEg/XL2NKs0J8iI/AAAAAAAAFfc/RQlvQLKSZDklWwkAOd7jkVaq-XXcwCzcACLcBGAs/s1600/lawofcooling.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="177" data-original-width="361" height="156" src="https://4.bp.blogspot.com/-WDr_mgo-qEg/XL2NKs0J8iI/AAAAAAAAFfc/RQlvQLKSZDklWwkAOd7jkVaq-XXcwCzcACLcBGAs/s320/lawofcooling.png" width="320" /></a></div>In the case that $n = 1$, this differential equation is exceedingly easy to write down, considering the difference $(T(x+dx)-T(x))-(T(x)-T(x-dx))$ as the double-derivative upon division by $dx^2$. More rigorously, what we're doing here is applying a <b>localised version of the fundamental theorem of calculus</b>. I.e. 
we're writing down:<br /><br />$$\begin{align}<br />\lim_{\Delta x \to 0} \frac{1}{\Delta x}(T'(x + \Delta x) - T'(x)) &= \lim_{\Delta x \to 0} \frac{1}{{\Delta x}}\int_x^{x + \Delta x} {T''(s)\,ds} \\<br />& = T''(x)<br />\end{align}<br />$$<br />More generally, we may consider the $n$-dimensional case.<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-HZE4Or8E8tU/XL2YeooEtuI/AAAAAAAAFfw/akx8XXDjp5clCqNjy4LzSmkFV3Fk0-KzACLcBGAs/s1600/laplace.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="522" data-original-width="605" height="276" src="https://1.bp.blogspot.com/-HZE4Or8E8tU/XL2YeooEtuI/AAAAAAAAFfw/akx8XXDjp5clCqNjy4LzSmkFV3Fk0-KzACLcBGAs/s320/laplace.png" width="320" /></a></div>Analogously to before, one may try to look at temperature flows in each direction -- here, we have an <i>integral</i>, done on the boundary of an infinitesimal region $V$ (this symbol will also represent the volume of the region):<br /><br />$$ \frac{{\partial T}}{{\partial t}} = \lim_{V \to 0} \frac{\alpha }{V}\int_{\partial V} {\hat u\,dS \cdot \vec \nabla T} $$<br />At this point, one may apply the divergence theorem, converting this to:<br /><br />$$\frac{{\partial T}}{{\partial t}} = \mathop {\lim }\limits_{V \to 0} \frac{\alpha }{V}\int\limits_V {\vec \nabla \cdot \vec \nabla T\;dV} = \alpha{\left| {\vec \nabla } \right|^2}T$$<br />In this sense, the divergence theorem is analogous to the fundamental theorem of calculus for manifolds with boundaries that are more than one-dimensional (see the bottom of the page for a link to a formalisation/an abstraction based on this analogy). But there are more ways to intuitively understand this.
Note how the Laplacian is the trace of the Hessian matrix (note: we use $\vec{\nabla}^2$ to refer to the Hessian and $\left|\vec\nabla\right|^2$ to refer to the Laplacian):<br /><br />$${\left| {\vec \nabla } \right|^2}T = {\mathop{\rm tr}} \left({\vec{\nabla} ^2}T\right)$$<br />The trace of a matrix is fundamentally linked to some notion of <i>averaging</i> -- the simplest interpretation of this is that $\frac{1}{n}{\mathop{\rm tr}}A$ is the mean of the eigenvalues. But more relevant to our situation, it can be shown that the trace of a matrix is $n$ times the expected value of the quadratic form defined by the matrix on the unit sphere -- or on a general sphere $S$ (this symbol will also represent its surface area):<br /><br />$${\mathop{\rm tr}} A = \frac{n}{S}\int_S {\frac{{\Delta {x^T}A\,\Delta x}}{{\Delta {x^T}\Delta x}}\,dS} $$<br />One may check that taking the limit as $\Delta x \to 0$, substituting the Hessian ${\vec{\nabla} ^2}T$ for $A$ and using ${\vec{\nabla} ^2}T\,\Delta \vec x = \Delta \left( {\vec \nabla T} \right)$, one gets the original "average of directional derivatives" expression.<br /><br /><div class = "twn-furtherinsight">Can you interpret the other coefficients of the characteristic polynomial in terms of statistical ideas?</div><br /><hr /><br /><div><b>Further reading:</b></div><div><ul><li>Using the "infinitesimal region" idea to define divergence, curl and Laplacian rigorously: <a href="https://www.khanacademy.org/math/multivariable-calculus/greens-theorem-and-stokes-theorem/formal-definitions-of-divergence-and-curl/a/formal-definition-of-divergence-in-two-dimensions">Khan Academy</a></li><li>An abstraction based on the "analogy" between the FTC, the divergence theorem, the Kelvin-Stokes theorem, etc. 
<a href="https://en.wikipedia.org/wiki/Stokes%27_theorem">Stokes' theorem (Wikipedia)</a></li></ul></div>calculusdivergence theoremheat equationlaplacianlinear algebraphysicsstatisticsstokes theoremtraceMon, 22 Apr 2019 11:57:00 GMTnoreply@blogger.comtag:blogger.com,1999:blog-3214648607996839529.post-6647801447602057527Abhimanyu Pallavi Sudhir2019-04-22T11:57:00ZAnswer by Abhimanyu Pallavi Sudhir for Computing the Lie bracket on the Lie group $GL(n, \mathbb{R})$
https://math.stackexchange.com/questions/1884253/computing-the-lie-bracket-on-the-lie-group-gln-mathbbr/3193887#3193887
1<p>I think the sensible way to get an intuition for this is to just look at the Taylor expansion of the group commutator:</p>
<p><span class="math-container">$$e^{\varepsilon x} e^{\varepsilon y} e^{-\varepsilon x} e^{-\varepsilon y}$$</span></p>
<p>which, to second order, is <span class="math-container">$1+\varepsilon^2(xy-yx)$</span>. Presumably you know how to prove that the second derivative of the above expression is equivalent to the derivative-of-the-adjoint definition.</p>Fri, 19 Apr 2019 18:26:17 GMThttps://math.stackexchange.com/questions/1884253/-/3193887#3193887Abhimanyu Pallavi Sudhir2019-04-19T18:26:17ZAnswer by Abhimanyu Pallavi Sudhir for Determinant-like expression for non-square matrices
https://math.stackexchange.com/questions/903028/determinant-like-expression-for-non-square-matrices/3191959#3191959
0<p>See <a href="https://arxiv.org/abs/1904.08097" rel="nofollow noreferrer">1904.08097</a> for a review I authored of generalised determinant functions of tall matrices, and their properties -- this should provide a self-contained introduction to three different generalised determinants. </p>
<p>The function mentioned by Joonas Ilmavirta is the square of the "determinant-like function" that I first wrote about in 2013, albeit with an erroneous factor of <span class="math-container">$\sqrt{|m-n|!}$</span> at the front, which is corrected in the above review. It is also the norm-squared of the vector determinant, and the product of the singular values of the matrix.</p>
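<p>As a quick numerical illustration of the equivalences in the last paragraph (the random tall matrix here is just an example, not taken from the review):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # a tall (5 x 3) matrix

# Joonas Ilmavirta's function: det(A^T A)
d1 = np.linalg.det(A.T @ A)
# the squared product of the singular values of A
d2 = np.prod(np.linalg.svd(A, compute_uv=False)) ** 2

print(np.isclose(d1, d2))  # True
```
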
<p>If you want a non-trivial determinant for "wide matrices", i.e. flattenings, you will need to be a bit creative in the definition of the determinant, such as by defining it as the scaling of <span class="math-container">$m$</span>-volumes where <span class="math-container">$m$</span> is the dimension of the flattened space.</p>Thu, 18 Apr 2019 03:38:38 GMThttps://math.stackexchange.com/questions/903028/-/3191959#3191959Abhimanyu Pallavi Sudhir2019-04-18T03:38:38ZAnswer by Abhimanyu Pallavi Sudhir for Intuitive explanation of a positive semidefinite matrix
https://math.stackexchange.com/questions/9758/intuitive-explanation-of-a-positive-semidefinite-matrix/3181937#3181937
1<p>Positive-definite matrices are matrices that are <strong>congruent to the identity matrix</strong>, i.e. that can be written as <span class="math-container">$P^HP$</span> for invertible <span class="math-container">$P$</span> (for some reason, a lot of authors define congruence as <span class="math-container">$N=P^TMP$</span>, but here we go by the Hermitian definition <span class="math-container">$N=P^HMP$</span>). </p>
<p>One reason this is useful is that if two forms <span class="math-container">$M$</span> and <span class="math-container">$N$</span> are congruent, their corresponding "generalised unitary groups" <span class="math-container">$\{A^HMA=M\}$</span> and <span class="math-container">$\{B^HNB=N\}$</span> are isomorphic (via conjugation by <span class="math-container">$P$</span>). So positive-definite matrices (as well as negative-definite matrices, because <span class="math-container">$-I$</span> is preserved by the unitary group as well) define a dot product whose geometry is isomorphic to Euclidean geometry.</p>
<p>Similarly, a <strong>positive semidefinite matrix</strong> defines a geometry that Euclidean geometry is <em>homeomorphic</em> to -- to put it slightly imprecisely, such a geometry has all the symmetries of Euclidean geometry, and perhaps then some.</p>
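<p>To make the congruence picture concrete, here is a small numerical sketch (the random <span class="math-container">$P$</span> is illustrative) showing that any matrix of the form <span class="math-container">$P^HP$</span> is Hermitian with non-negative eigenvalues:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = P.conj().T @ P            # congruent to the identity: P^H I P

# A is Hermitian, and its eigenvalues are the squared singular
# values of P -- non-negative, and positive when P is invertible
print(np.allclose(A, A.conj().T))
print(np.all(np.linalg.eigvalsh(A) > 0))
```
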
<p>See a fuller treatment <strong><a href="https://thewindingnumber.blogspot.com/2019/04/geometry-positive-definiteness-and.html" rel="nofollow noreferrer">here</a></strong>.</p>Wed, 10 Apr 2019 06:48:08 GMThttps://math.stackexchange.com/questions/9758/-/3181937#3181937Abhimanyu Pallavi Sudhir2019-04-10T06:48:08ZAnswer by Abhimanyu Pallavi Sudhir for Can non-linear transformations be represented as Transformation Matrices?
https://math.stackexchange.com/questions/450/can-non-linear-transformations-be-represented-as-transformation-matrices/3177854#3177854
0<p>The point of transformation matrices is that the images of the <span class="math-container">$n$</span> basis vectors are sufficient to determine the action of the entire transformation -- this is true for linear transformations, but not for an arbitrary transformation.</p>
<p>However, nonlinear transformations (the smooth ones, anyway) can be locally approximated as linear transformations. With a bit of calculus, you get the "Jacobian matrix", which acts on the tangent vector space at every point on a manifold. This is a generalisation of transformation matrices in the sense that a linear transformation's Jacobian is equal to its matrix representation, i.e. in the same sense that the derivative generalises the slope (which completely determines a linear function <span class="math-container">$y=mx$</span>).</p>Sun, 07 Apr 2019 06:48:31 GMThttps://math.stackexchange.com/questions/450/-/3177854#3177854Abhimanyu Pallavi Sudhir2019-04-07T06:48:31ZAnswer by Abhimanyu Pallavi Sudhir for Why does $A^TA=I, \det A=1$ mean $A$ is a rotation matrix?
https://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807
2<p>You could just write out the components to confirm that this is so -- a much more interesting way to understand things, however, is to write down the condition as:</p>
<p><span class="math-container">$$A^TIA=I$$</span></p>
<p>The idea is that the matrix <span class="math-container">$A$</span> <em>preserves the identity quadratic form</em> -- note that <span class="math-container">$I$</span> is a quadratic form here and not a linear transformation, as this is the transformation law for quadratic forms (<span class="math-container">$A^TMA$</span> instead of <span class="math-container">$A^{-1}MA$</span>).</p>
<p>The hyperconic section corresponding to the identity quadratic form is the unit sphere -- thus the orthogonal transformations are all those that preserve the unit sphere. Another way of putting this is that <span class="math-container">$(Ax)^TI(Ay)=x^TA^TIAy=x^TIy$</span>, i.e. the Euclidean dot product <span class="math-container">$I$</span> is preserved by <span class="math-container">$A$</span>. This is equivalent to preserving the unit sphere, because the unit sphere is determined by the dot product on the given space.</p>
<p>What sort of transformations preserve the unit sphere? </p>
<hr>
<p>The reason this is a good way of understanding things is that there are plenty of other "dot products" you can define. One elementary one from physics is the Minkowski dot product in special relativity, <span class="math-container">$\mathrm{diag}(-1,1,1,1)$</span> -- the corresponding quadric surface is a hyperboloid, and the transformations that preserve it, forming the Lorentz group, are boosts (skews between time and a spatial dimension), spatial rotations and reflections.</p>
<hr>
<p>As for discriminating between rotations and reflections, suppose we define rotations in a completely geometric way -- for a matrix to be a rotation, all its eigenvalues are either 1 or in pairs of unit complex conjugates. </p>
<p>What do the eigenvalues of orthogonal matrices look like? For each eigenvalue, you need <span class="math-container">$\overline{\lambda}\lambda=1$</span>, i.e. all the eigenvalues are unit complex numbers. If a complex eigenvalue isn't paired with a corresponding conjugate, you will not get a real-valued transformation on <span class="math-container">$\mathbb{R}^n$</span>. Meanwhile if an eigenvalue of -1 isn't paired with another -1 -- i.e. if there are an odd number of reflections -- you get a reflection. The orthogonal (or rather unitary) transformations that do not behave this way are precisely the rotations.</p>
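<p>A quick numerical illustration of the eigenvalue claim (constructing a random orthogonal matrix via a QR factorisation is just a convenient choice for the example):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # a random 4x4 orthogonal matrix

lam = np.linalg.eigvals(Q)
print(np.allclose(np.abs(lam), 1.0))           # every eigenvalue is a unit complex number
print(np.isclose(abs(np.linalg.det(Q)), 1.0))  # the determinant is +1 or -1
```
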
<p>The similarity between unpaired unit complex eigenvalues and unpaired -1's is interesting, by the way -- when thinking about reflections, you might have gotten the idea that reflections are <span class="math-container">$\pi$</span>-angle rotations in a higher-dimensional space -- like the vector was rotated through a higher-dimensional space and then landed on its reflection -- like it was a discrete snapshot of a process as smooth as any rotation. </p>
<p>Well, now you know what this higher-dimensional space is -- precisely <span class="math-container">$\mathbb{C}^n$</span>. And the determinant of a unitary matrix also takes a continuous spectrum -- the entire unit circle. In this sense (among other senses) complex linear algebra is more "complete" than real linear algebra.</p>Sun, 07 Apr 2019 05:55:12 GMThttps://math.stackexchange.com/questions/68119/why-does-ata-i-det-a-1-mean-a-is-a-rotation-matrix/3177807#3177807Abhimanyu Pallavi Sudhir2019-04-07T05:55:12ZAnswer by Abhimanyu Pallavi Sudhir for Reasoning about Lie theory and the Exponential Map
https://math.stackexchange.com/questions/19575/reasoning-about-lie-theory-and-the-exponential-map/3177348#3177348
0<p>The identity element <em>does</em> have significance, in the sense that it is the only natural way to think of the elements of the Lie Algebra as infinitesimal generators.</p>
<p>As I explain <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html" rel="nofollow noreferrer">here</a>, the idea is that with elements of the form <span class="math-container">$1+\varepsilon\vec\theta$</span>, elements of the group are generated as </p>
<p><span class="math-container">$$g(\vec\theta)=(1+\varepsilon\vec\theta)^{1/\varepsilon}=\exp\vec\theta$$</span></p>
<p>This map only exists when elements close to the identity are taken, as every element other than the identity is itself a generator (thus elements of the group can simply be generated via real-powers, not infinitesimally).</p>
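<p>For a one-parameter example, a CAS can confirm the limit behind this generation process (a SymPy sketch; the symbol names are illustrative):</p>

```python
import sympy as sp

theta, eps = sp.symbols('theta varepsilon', positive=True)

# compounding an infinitesimal generator: (1 + eps*theta)^(1/eps) -> exp(theta)
g = (1 + eps * theta) ** (1 / eps)
print(sp.limit(g, eps, 0))  # exp(theta)
```
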
<p><img src="https://i.stack.imgur.com/0AC5rm.png" width="500" /></p>Sat, 06 Apr 2019 19:21:58 GMThttps://math.stackexchange.com/questions/19575/-/3177348#3177348Abhimanyu Pallavi Sudhir2019-04-06T19:21:58ZAnswer by Abhimanyu Pallavi Sudhir for Binomial product expansion
https://math.stackexchange.com/questions/1331401/binomial-product-expansion/3172053#3172053
0<p>It is not a generalisation of the Binomial theorem because the exponent of <span class="math-container">$c$</span> isn't really handled -- they just took it outside. If you were to expand out the right-hand-side, you would have a generalisation of the Binomial theorem.</p>Tue, 02 Apr 2019 16:08:25 GMThttps://math.stackexchange.com/questions/1331401/-/3172053#3172053Abhimanyu Pallavi Sudhir2019-04-02T16:08:25ZAnswer by Abhimanyu Pallavi Sudhir for Intuition for the exponential of a matrix
https://math.stackexchange.com/questions/1213264/intuition-for-the-exponential-of-a-matrix/3165551#3165551
1<p>When I first learned about cyclic groups, the picture that I always had in my head was of the unit circle in the complex plane -- imagine my shock when I realised it wasn't a cyclic group at all! But I really <em>wanted</em> it to be cyclic, because it shared some really interesting properties with cyclic groups (see my post <em><a href="https://thewindingnumber.blogspot.com/2018/12/intuition-analogies-and-abstraction.html" rel="nofollow noreferrer">Intuition, analogies and abstraction</a></em>).</p>
<p>The solution to the problem can be seen directly from the quickest proof that the unit circle isn't cyclic -- the fact that it isn't countable (while the integers are). So here's an idea: let's admit <em>real powers on groups</em>!</p>
<p>Ok, but how? We know the construction of integer powers on an arbitrary group, and we know how real powers work on the unit circle, or the real line (which is also real-power cyclic*, by the way), and it's conventionally equal to <span class="math-container">$x^r=\exp(r\log x)$</span> with <span class="math-container">$\exp$</span> given by its power series expansion.</p>
<p>But sticking just to our intuition for now, it would seem like the natural way to define a real power is to introduce a real-number parameterisation to our group -- for example, the circle group can be parameterised by <span class="math-container">$\theta$</span> and each element of the group is given by some <span class="math-container">$g(\theta)$</span>. Then real powers would look like <span class="math-container">$g(\theta)^r=g(r\theta)$</span>. In the case of a one-parameter group, we also have <span class="math-container">$g(\theta_1+\theta_2)=g(\theta_1)g(\theta_2)$</span>, but don't get too attached to this.</p>
<p>If you think about it, we've now just given some <em>additional structure</em> to our group -- a geometric structure in addition to the group structure.</p>
<p>But frankly, introducing a parameterisation in this way is a bit hand-wavy. We knew what parameterisation to introduce for the circle group because we already have a picture of its geometry in our heads, but in principle, we could've introduced really any kind of ridiculous parameterisation and given it a really ugly structure and an ugly real-power. What we need is a sensible, systematic way to introduce this parameterisation -- i.e. to think about what this parameter space really <em>is</em>.</p>
<p>The answer to the question comes from Euler's formula, which relates addition on the imaginary line to multiplication on the unit circle. </p>
<p><span class="math-container">$$\exp(i\theta)=g(\theta)$$</span></p>
<p>What significance does the imaginary line have to the unit circle? Well, something interesting is that the tangent to the unit circle at 1 is parallel to the imaginary line, i.e. all its elements are of the form <span class="math-container">$1+it$</span>. So an idea for the parameterisation is that the parameter space is the tangent space at the identity of the group -- this is the Lie algebra of the group.</p>
<p>(You still need to prove that this actually works in general -- this has to do with proving that all derivatives of the exponential map at the identity can be recovered as <span class="math-container">$g^{(k)}(0)=(g'(0))^k$</span> -- this is a property of exponential functions of the form <span class="math-container">$g(t)=e^{bt}$</span>, and is part of the "exponential structure" of the Lie Algebra/Lie Group correspondence.)</p>
<p>This is not too bad! It's not completely absurd to think about the "vicinity of the identity" of at least matrix groups, so it's not absurd to think about tangent spaces to these groups. This is where you see arguments like <span class="math-container">$(1+\varepsilon t)^T(1+\varepsilon t)=1+\varepsilon(t+t^T)$</span> implying the tangent space to an Orthogonal Group is an algebra of antisymmetric matrices, etc. -- if you have some notion of perturbing an element in your group, you can construct a Lie algebra parameterisation of it.</p>
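<p>As a numerical sketch of this (the random antisymmetric matrix and the value of <span class="math-container">$n$</span> are illustrative): compounding a small antisymmetric perturbation of the identity does land on an orthogonal matrix, since <span class="math-container">$(1+X/n)^n\to\exp X$</span>:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
X = M - M.T                      # antisymmetric: X + X^T = 0

# real powers by compounding: (I + X/n)^n approximates exp(X) for large n
n = 1_000_000
g = np.linalg.matrix_power(np.eye(3) + X / n, n)

print(np.allclose(g.T @ g, np.eye(3), atol=1e-4))  # g is (very nearly) orthogonal
```
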
<hr>
<p>*To the best of my knowledge, "real-power cyclic" is not a real word -- the conventional term is "one-parameter Lie group".</p>
<p>See my post <a href="https://thewindingnumber.blogspot.com/2019/04/introduction-to-lie-groups.html" rel="nofollow noreferrer">Introduction to Lie groups</a> for a more complete treatment.</p>Thu, 28 Mar 2019 06:29:55 GMThttps://math.stackexchange.com/questions/1213264/-/3165551#3165551Abhimanyu Pallavi Sudhir2019-03-28T06:29:55ZAnswer by Abhimanyu Pallavi Sudhir for What's the generalisation of the quotient rule for higher derivatives?
https://math.stackexchange.com/questions/5357/whats-the-generalisation-of-the-quotient-rule-for-higher-derivatives/3131947#3131947
1<p>I'm checking @Mohammad Al Jamal's formula with SymPy, and I can verify it's true (barring a missing <span class="math-container">$(-1)^k$</span> term) for up to <span class="math-container">$n = 16$</span>, at least (it gets really slow after that).</p>
<pre>
import sympy as sp

k = sp.Symbol('k'); x = sp.Symbol('x'); f = sp.Function('f'); g = sp.Function('g')

n = 0
while True:
    fgn = sp.diff(f(x) / g(x), x, n)
    guess = sp.summation((-1) ** k * sp.binomial(n + 1, k + 1) \
        * sp.diff(f(x) * (g(x)) ** k, x, n) / (g(x) ** (k + 1)), (k, 0, n))
    print("{} for n = {}".format(sp.expand(guess - fgn) == 0, n))
    n += 1
</pre>
<p>This is quite surprising to me -- I didn't expect there to be such a simple and straightforward expression for <span class="math-container">$(f(x)/g(x))^{(n)}$</span>, and haven't seen his formula anywhere before. I tried some inductive proofs, but I haven't succeeded in proving it yet.</p>Sat, 02 Mar 2019 00:15:06 GMThttps://math.stackexchange.com/questions/5357/-/3131947#3131947Abhimanyu Pallavi Sudhir2019-03-02T00:15:06ZAnswer by Abhimanyu Pallavi Sudhir for Why didn't Lorentz conclude that no object can go faster than light?
https://physics.stackexchange.com/questions/461833/why-didnt-lorentz-conclude-that-no-object-can-go-faster-than-light/461863#461863
12<p>Because typically if you find an expression that seems to break down at some value of <span class="math-container">$v$</span>, you would conclude that the expression simply loses its validity for that value of <span class="math-container">$v$</span>, not that the value isn't attainable. Presumably this was the conclusion of Lorentz and others.</p>
<p>The reason Einstein concluded otherwise is that special relativity gives a physical argument for "superluminal speeds are equivalent to time running backwards" -- the argument is "does a superluminal ship hit the iceberg before or after its headlight does?" </p>
<p>This depends on the observer, and because the headlight would melt the iceberg, the consequences of each observation are noticeably different. The only possible conclusions are "superluminal ships don't exist", "time runs backwards for superluminal observers", or "iceberg-melting headlights don't exist".</p>Wed, 20 Feb 2019 10:43:02 GMThttps://physics.stackexchange.com/questions/461833/-/461863#461863Abhimanyu Pallavi Sudhir2019-02-20T10:43:02ZAnswer by Abhimanyu Pallavi Sudhir for What kind of matrices are non-diagonalizable?
https://math.stackexchange.com/questions/472915/what-kind-of-matrices-are-non-diagonalizable/3097881#3097881
4<p><strong>Edit:</strong> The algebra I speak of here is <em>not</em> actually the Grassmann numbers at all -- they are <span class="math-container">$\mathbb{R}[X]/(X^n)$</span>, whose generators <em>don't</em> satisfy the anticommutativity relation even though they satisfy all the nilpotency relations. The dual-number stuff for 2 by 2 is still correct, just ignore my use of the word "Grassmann".</p>
<hr>
<p>Non-diagonalisable 2 by 2 matrices can be diagonalised over the <a href="https://en.wikipedia.org/wiki/Dual_number" rel="nofollow noreferrer">dual numbers</a> -- and the "weird cases" like the Galilean transformation are not fundamentally different from the nilpotent matrices.</p>
<p>The intuition here is that the Galilean transformation is sort of a "boundary case" between real-diagonalisability (skews) and complex-diagonalisability (rotations) (which you can sort of think in terms of discriminants). In the case of the Galilean transformation <span class="math-container">$\left[\begin{array}{*{20}{c}}{1}&{v}\\{0}&{1}\end{array}\right]$</span>, it's a small perturbation away from being diagonalisable, i.e. it sort of has "repeated eigenvectors" (you can visualise this with <a href="https://shadanan.github.io/MatVis/" rel="nofollow noreferrer">MatVis</a>). So one may imagine that the two eigenvectors are only an "epsilon" away, where <span class="math-container">$\varepsilon$</span> is the unit dual satisfying <span class="math-container">$\varepsilon^2=0$</span> (called the "soul"). Indeed, its characteristic polynomial is:</p>
<p><span class="math-container">$$(\lambda-1)^2=0$$</span></p>
<p>Whose solutions among the dual numbers are <span class="math-container">$\lambda=1+k\varepsilon$</span> for real <span class="math-container">$k$</span>. So one may "diagonalise" the Galilean transformation over the dual numbers as e.g.:</p>
<p><span class="math-container">$$\left[\begin{array}{*{20}{c}}{1}&{0}\\{0}&{1+v\varepsilon}\end{array}\right]$$</span></p>
<p>Granted, this is not unique: it is formed from the change-of-basis matrix <span class="math-container">$\left[\begin{array}{*{20}{c}}{1}&{1}\\{0}&{\varepsilon}\end{array}\right]$</span>, but any vector of the form <span class="math-container">$(1,k\varepsilon)$</span> is a valid eigenvector. You could, if you like, consider this a canonical or "principal value" of the diagonalisation, and in general each diagonalisation corresponds to a limit you can take of real/complex-diagonalisable transformations. Another way of thinking about this is that there is an entire eigenspace spanned by <span class="math-container">$(1,0)$</span> and <span class="math-container">$(1,\varepsilon)$</span> in that little gap of multiplicity. In this sense, the geometric multiplicity is forced to be equal to the algebraic multiplicity*.</p>
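<p>One can check this diagonalisation symbolically, treating <span class="math-container">$\varepsilon$</span> as an ordinary symbol and discarding <span class="math-container">$\varepsilon^2$</span> terms at the end (a SymPy sketch):</p>

```python
import sympy as sp

v, eps = sp.symbols('v varepsilon')

A = sp.Matrix([[1, v], [0, 1]])    # the Galilean transformation
P = sp.Matrix([[1, 1], [0, eps]])  # the change-of-basis matrix above
D = sp.diag(1, 1 + v * eps)        # the proposed diagonalisation

# A*P - P*D should vanish modulo eps^2 = 0
diff = (A * P - P * D).applyfunc(lambda e: sp.expand(e).subs(eps ** 2, 0))
print(diff == sp.zeros(2, 2))  # True
```
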
<p>Then a nilpotent matrix with characteristic polynomial <span class="math-container">$\lambda^2=0$</span> has solutions <span class="math-container">$\lambda=k\varepsilon$</span>, and is simply diagonalised as:</p>
<p><span class="math-container">$$\left[\begin{array}{*{20}{c}}{0}&{0}\\{0}&{\varepsilon}\end{array}\right]$$</span></p>
<p>(Think about this.) Indeed, the resulting matrix has minimal polynomial <span class="math-container">$\lambda^2=0$</span>, and the eigenvectors are as before.</p>
<hr>
<p>What about higher dimensional matrices? Consider:</p>
<p><span class="math-container">$$\left[ {\begin{array}{*{20}{c}}0&v&0\\0&0&w\\0&0&0\end{array}} \right]$$</span></p>
<p>This is a nilpotent matrix <span class="math-container">$A$</span> satisfying <span class="math-container">$A^3=0$</span> (but not <span class="math-container">$A^2=0$</span>). The characteristic polynomial is <span class="math-container">$\lambda^3=0$</span>. Although <span class="math-container">$\varepsilon$</span> might seem like a sensible choice, it doesn't really do the trick -- if you try a diagonalisation of the form <span class="math-container">$\mathrm{diag}(0,v\varepsilon,w\varepsilon)$</span>, it has minimal polynomial <span class="math-container">$A^2=0$</span>, which is wrong. Indeed, you won't be able to find three linearly independent eigenvectors to diagonalise the matrix this way -- they'll all take the form <span class="math-container">$(a+b\varepsilon,0,0)$</span>.</p>
<p>Instead, you need to consider a generalisation of the dual numbers, called the Grassmann numbers, with the soul satisfying <span class="math-container">$\epsilon^n=0$</span>. Then the diagonalisation takes for instance the form:</p>
<p><span class="math-container">$$\left[ {\begin{array}{*{20}{c}}0&0&0\\0&{v\epsilon}&0\\0&0&{w\epsilon}\end{array}} \right]$$</span></p>
<hr>
<p>*Over the reals and complexes, when one defines algebraic multiplicity (as "the multiplicity of the corresponding factor in the characteristic polynomial"), there is a single eigenvalue corresponding to that factor. This is of course no longer true over the Grassmann numbers, because they are not a field, and <span class="math-container">$ab=0$</span> no longer implies "<span class="math-container">$a=0$</span> or <span class="math-container">$b=0$</span>".</p>
<p>In general, if you want to prove things about these numbers, the way to formalise them is by constructing them as the quotient <span class="math-container">$\mathbb{R}[X]/(X^n)$</span>, so you actually have something clear to work with.</p>
<p>(Perhaps relevant: <a href="https://math.stackexchange.com/questions/46078/grassmann-numbers-as-eigenvalues-of-nilpotent-operators">Grassmann numbers as eigenvalues of nilpotent operators?</a> -- discussing the fact that the Grassmann numbers are not a field).</p>
<p>You might wonder if this sort of approach can be applicable to LTI differential equations with repeated roots -- after all, their characteristic matrices are exactly of this Grassmann form. As pointed out in the comments, however, this diagonalisation is still not via an invertible change-of-basis matrix, it's still only of the form <span class="math-container">$PD=AP$</span>, not <span class="math-container">$D=P^{-1}AP$</span>. I don't see any way to bypass this. See my posts <a href="https://thewindingnumber.blogspot.com/2019/02/all-matrices-can-be-diagonalised.html" rel="nofollow noreferrer">All matrices can be diagonalised</a> (a re-post of this answer) and <a href="https://thewindingnumber.blogspot.com/2018/03/repeated-roots-of-differential-equations.html" rel="nofollow noreferrer">Repeated roots of differential equations</a> for ideas, I guess.</p>Sat, 02 Feb 2019 22:17:56 GMThttps://math.stackexchange.com/questions/472915/-/3097881#3097881Abhimanyu Pallavi Sudhir2019-02-02T22:17:56ZAnswer by Abhimanyu Pallavi Sudhir for Relativity from a basic assumption
https://physics.stackexchange.com/questions/455712/relativity-from-a-basic-assumption/455753#455753
1<p>I will give <em>a</em> derivation of the Lorentz boosts requiring (what at least seem to be) minimal assumptions, and we will look at what assumptions we used, and see if some of them can be derived from each other, etc. Note that by "the Lorentz transformations", I mean the Lorentz transformation of spacetime position -- Lorentz transformations of other four-tuples (i.e. proving that they are Lorentz vectors) would require other assumptions, of course. I've given a fuller explanation of the derivation <a href="https://thewindingnumber.blogspot.com/2017/09/introduction-to-special-relativity.html" rel="nofollow noreferrer">here</a>.</p>
<p><strong>(a)</strong> The first important fact you need to prove anything about the Lorentz transformations is that they are linear. Linearity is logically equivalent to the following conditions: (under the transformation),</p>
<ul>
<li><strong>all straight lines remain straight lines</strong> -- the physical interpretation of this is that if an object's velocity is constant in one inertial reference frame, it is constant in all inertial reference frames. This follows from the <em>principle of relativity</em>.</li>
<li><strong>the origin remains fixed</strong> -- this is true by definition of the transformations we are considering -- boosts passing through the same origin.</li>
</ul>
<p>With this, we know that we can use a matrix to write down the Lorentz transformations. Which matrix?</p>
<p><strong>(b)</strong> The tilt/angle of the <span class="math-container">$t'$</span>, <span class="math-container">$x'$</span> axes with respect to the <span class="math-container">$t$</span>, <span class="math-container">$x$</span> axes. The tilt of the <span class="math-container">$t'$</span> axis follows from the definition of velocity as the gradient of the worldline. To prove the tilt of the <span class="math-container">$x'$</span> axis is equal to this tilt, we first need to define the <span class="math-container">$x'$</span> axis within the unprimed co-ordinate system. </p>
<p>This is possible by considering features invariant under a boost, i.e. from the principle of relativity -- the obvious invariant is as follows: if you emitted a light ray <span class="math-container">$a$</span> seconds in the past, and it reflects off some object and returns to you <span class="math-container">$a$</span> seconds in the future, then that object was on your <span class="math-container">$x$</span>-axis at time 0.</p>
<p><a href="https://i.stack.imgur.com/zC7TS.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/zC7TS.png" alt="enter image description here"></a></p>
<p>By the principle of relativity, this should apply in the primed reference frame as well. By the invariance of the speed of light, the slope of the light ray is the same in the primed reference frame. Now figuring out the angle of tilt of the <span class="math-container">$x'$</span> axis becomes an exercise in geometry.</p>
<p><a href="https://i.stack.imgur.com/QvRjN.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/QvRjN.png" alt="enter image description here"></a></p>
<p>And it's easy to prove, by drawing an appropriate circle, that the two tilts are equal.</p>
<p><strong>(c)</strong> We now know the lines the column vectors of our matrix land on -- they are multiples of <span class="math-container">$(1, v)$</span> and <span class="math-container">$(v, 1)$</span>, but which vector on that line exactly? In other words, what's the scale on the axes? This requires one extra assumption: if you boost into the frame with velocity <span class="math-container">$v$</span>, then boost <span class="math-container">$-v$</span> back, that's equivalent to not boosting at all, i.e. <span class="math-container">$L(v)L(-v)=I$</span>. Then it's just computation:</p>
<p><span class="math-container">\begin{gathered}
\left[ {\begin{array}{*{20}{c}}
1&0 \\
0&1
\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}
\alpha &{\beta v} \\
{\alpha v}&\beta
\end{array}} \right]\left[ {\begin{array}{*{20}{c}}
\alpha &{ - \beta v} \\
{ - \alpha v}&\beta
\end{array}} \right] = \left[ {\begin{array}{*{20}{c}}
{{\alpha ^2} - \alpha \beta {v^2}}&{{\beta ^2}v - \alpha \beta v} \\
{{\alpha ^2}v - \alpha \beta v}&{{\beta ^2} - \alpha \beta {v^2}}
\end{array}} \right] \hfill \\
{\alpha ^2}v - \alpha \beta v = 0 = {\beta ^2}v - \alpha \beta v \Rightarrow {\alpha ^2} = \alpha \beta = {\beta ^2} \Rightarrow \alpha = \beta \hfill \\
{\alpha ^2} - \alpha \beta {v^2} = 1 = {\beta ^2} - \alpha \beta {v^2} \Rightarrow {\alpha ^2} = 1 + \alpha \beta {v^2} = {\beta ^2} \Rightarrow {\alpha ^2} = 1 + {\alpha ^2}{v^2} \hfill \\
\Rightarrow \alpha = \beta = \frac{1}{{\sqrt {1 - {v^2}} }} \hfill \\
\end{gathered}</span></p>
<p>Then the change of basis matrix is simply the inverse of this matrix, which is:</p>
<p><span class="math-container">$$\Lambda=\gamma \left[ {\begin{array}{*{20}{c}}
1&-v \\
-v&1
\end{array}} \right]$$</span></p>
<p>Or:</p>
<p><span class="math-container">\begin{gathered}
x' = \gamma \left( {x - vt} \right) \\
t' = \gamma \left( {t - vx} \right) \\
\end{gathered}</span></p>
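As a sanity check (a sketch of mine, not part of the derivation), one can apply these equations and verify that an event on the light cone stays on the light cone -- i.e. the invariance of the speed of light is built in:

```python
import numpy as np

def boost(t, x, v):
    # x' = gamma (x - v t), t' = gamma (t - v x), with c = 1
    g = 1.0 / np.sqrt(1.0 - v**2)
    return g * (t - v * x), g * (x - v * t)

t, x = 2.0, 2.0              # an event on the light cone (x = t)
tp, xp = boost(t, x, v=0.6)
assert np.isclose(tp, xp)    # still on the light cone: x' = t'
```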
<p><strong>(d)</strong> There's still one final step, however -- we need to verify that <span class="math-container">$y$</span> and <span class="math-container">$z$</span> aren't transformed under the Lorentz boost. To see this, consider two twins with paintbrushes running towards each other, each painting the wall at waist level. If the orthogonal axes were transformed in any way, each twin would see his paint-streak as above the other's -- but the streaks' relative positioning cannot differ between observers, as one can see by, e.g., supposing that the two paints cause an explosion when mixed. That the presence of an explosion (or any boolean quantity) is invariant under Lorentz transformations is a consequence of the principle of relativity.</p>
<hr>
<p>We used three physical assumptions:</p>
<ul>
<li>The principle of relativity</li>
<li>The invariance of the speed of light</li>
<li><span class="math-container">$L(v)L(-v)=L(0)$</span>, or "if I see you moving at <span class="math-container">$v$</span>, you see me moving at <span class="math-container">$-v$</span>"</li>
</ul>
<p>The first two are the assumptions you wanted. As far as I can see, the last assumption can't really be proven from the other two -- it requires some sort of symmetry principle. But that's okay.</p>Mon, 21 Jan 2019 22:48:47 GMThttps://physics.stackexchange.com/questions/455712/-/455753#455753Abhimanyu Pallavi Sudhir2019-01-21T22:48:47ZAnswer by Abhimanyu Pallavi Sudhir for Varying constants in special relativity
https://physics.stackexchange.com/questions/455159/varying-constants-in-special-relativity/455176#455176
1<blockquote>
<p>(presumably) everything has mass, there is no such thing as a perfect inertial frame of reference</p>
</blockquote>
<p>This isn't right. "There isn't generally a perfectly flat co-ordinate system" does not imply everything has mass, and being an inertial reference frame has nothing to do with the associated observer's mass (in fact, the Lorentz transformation associated with a photon's "co-ordinate system" is singular, so there isn't really a co-ordinate system/reference frame associated with it).</p>
<p>I guess your concern is with the fact that photons are affected by spacetime curvature -- this is true, but a key point of general relativity is that this doesn't imply anything about mass.</p>Fri, 18 Jan 2019 20:30:09 GMThttps://physics.stackexchange.com/questions/455159/-/455176#455176Abhimanyu Pallavi Sudhir2019-01-18T20:30:09ZAnswer by Abhimanyu Pallavi Sudhir for How to understand Cantor's diagonalization method in proving the uncountability of the real numbers?
https://math.stackexchange.com/questions/2855987/how-to-understand-cantors-diagonalization-method-in-proving-the-uncountability/3064377#3064377
0<p>Here's a perhaps more fathomable way to phrase what everyone has already said: even if you were to include <span class="math-container">$\infty$</span> as an integer, it would be just one integer. On the other hand, counting 4142... and 1088... separately means you're adding a much larger number of infinities to your set. </p>
<p>How many numbers exactly? There are ten choices for each digit and an infinite number of digits, so indeed you're adding <span class="math-container">$10^{\aleph_0}=2^{\aleph_0}$</span> numbers to your integers, which is precisely the cardinality of the reals.</p>Sun, 06 Jan 2019 20:49:32 GMThttps://math.stackexchange.com/questions/2855987/-/3064377#3064377Abhimanyu Pallavi Sudhir2019-01-06T20:49:32ZAnswer by Abhimanyu Pallavi Sudhir for What is really curved, spacetime, or simply the coordinate lines?
https://physics.stackexchange.com/questions/290906/what-is-really-curved-spacetime-or-simply-the-coordinate-lines/452416#452416
0<p>Curved co-ordinates on flat spacetime correspond to accelerating observers, not gravity. </p>
<p>The first physical insight of general relativity is that when you have gravity, you have <em>no</em> globally inertial frames -- contrast this with flat space, where you can always construct a linear co-ordinate system. The second physical insight is that you do have locally inertial frames, specifically the freefalling ones -- this is the "equivalence principle" -- so the manifold you use to model spacetime must necessarily have local flatness. Consequently, (pseudo-)Riemannian manifolds become the right way to model spacetime in general relativity.</p>
<p>This is why Christoffel symbols exist for accelerating observers on flat spacetime too -- they're first-order in the derivatives of the metric, and so can be eliminated by transforming into a flat co-ordinate system where the metric is constant (this is okay because the Christoffel symbols aren't tensors). The Riemann curvature tensor, on the other hand, is second-order in the derivatives of the metric and cannot be eliminated by a co-ordinate transformation.</p>Sun, 06 Jan 2019 12:13:03 GMThttps://physics.stackexchange.com/questions/290906/-/452416#452416Abhimanyu Pallavi Sudhir2019-01-06T12:13:03ZAnswer by Abhimanyu Pallavi Sudhir for Relative velocity greater than speed of light
https://physics.stackexchange.com/questions/452078/relative-velocity-greater-than-speed-of-light/452100#452100
0<p>Velocity is definitionally the same as "relative velocity". This is the point of the first postulate of relativity.</p>Fri, 04 Jan 2019 15:11:44 GMThttps://physics.stackexchange.com/questions/452078/-/452100#452100Abhimanyu Pallavi Sudhir2019-01-04T15:11:44ZAnswer by Abhimanyu Pallavi Sudhir for Does spacetime position not form a four-vector?
https://physics.stackexchange.com/questions/192886/does-spacetime-position-not-form-a-four-vector/450137#450137
0<p>Right -- vectors in general relativity live in some tangent space. This is the point of differential geometry, and of calculus in general -- you approximate non-linear things, which are <em>not</em> vector spaces (like curvy manifolds) with linear things (like their tangent spaces), which are vector spaces. This is exactly the motivation for defining the basis vectors as <span class="math-container">$\partial_\mu$</span>, as you describe.</p>Mon, 24 Dec 2018 07:04:21 GMThttps://physics.stackexchange.com/questions/192886/-/450137#450137Abhimanyu Pallavi Sudhir2018-12-24T07:04:21ZAnswer by Abhimanyu Pallavi Sudhir for What is an event in Special Relativity?
https://physics.stackexchange.com/questions/389488/what-is-an-event-in-special-relativity/444892#444892
1<p>It is perfectly reasonable to say that an event is a point in spacetime and that spacetime is a collection of events -- it is not "circular" as you claim in the comments. This is just the physics version of "a vector is an element of a vector space" and "a vector space is a set of vectors". You have axioms in math, and you have axioms in physics. The only difference is that in math, the objects are abstract, but in physics, they have a physical interpretation.</p>Mon, 03 Dec 2018 16:51:40 GMThttps://physics.stackexchange.com/questions/389488/-/444892#444892Abhimanyu Pallavi Sudhir2018-12-03T16:51:40ZAnswer by Abhimanyu Pallavi Sudhir for Why is the scalar product of two four-vectors Lorentz-invariant?
https://physics.stackexchange.com/questions/442119/why-is-the-scalar-product-of-two-four-vectors-lorentz-invariant/442164#442164
2<p>Here's the way to think about this -- why is the standard Euclidean dot product, <span class="math-container">$\sum x_iy_i$</span> interesting? Well, it is interesting primarily from the perspective of rotations, due to the fact that rotations leave dot products invariant. The reason this is so is that this dot product can be written as <span class="math-container">$|x||y|\cos\Delta\theta$</span>, and rotations leave magnitudes and relative angles invariant.</p>
<p>Is the standard Euclidean norm <span class="math-container">$|x|$</span> invariant under Lorentz transformations? Of course not -- for instance, <span class="math-container">$\Delta t^2+\Delta x^2$</span> is clearly not invariant, but <span class="math-container">$\Delta t^2-\Delta x^2$</span> is. Similarly, <span class="math-container">$E^2+p^2$</span> is not invariant, but <span class="math-container">$E^2-p^2$</span> is. The reason this is the case is that Lorentz boosts are fundamentally skew transformations, which means the invariant locus is a hyperbola, not a circle. So you have <span class="math-container">$\cosh^2 \xi - \sinh^2 \xi = 1$</span>, and <span class="math-container">$x_0^2-x_1^2$</span> is the right way to think of the norm on Minkowski space.</p>
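To make the contrast concrete, here is a small numerical sketch (my own addition, in units where $c=1$): a boost preserves the Minkowski product $x_0y_0-x_1y_1$ but not the Euclidean one:

```python
import numpy as np

def boost(vec, v):
    # Lorentz boost of a (t, x) two-vector, c = 1
    g = 1.0 / np.sqrt(1.0 - v**2)
    t, x = vec
    return np.array([g * (t - v * x), g * (x - v * t)])

def minkowski(a, b):
    return a[0] * b[0] - a[1] * b[1]

x = np.array([3.0, 1.0])
y = np.array([2.0, 0.5])
xb, yb = boost(x, 0.6), boost(y, 0.6)
assert np.isclose(minkowski(x, y), minkowski(xb, yb))  # invariant under the boost
assert not np.isclose(x @ y, xb @ yb)                  # Euclidean product is not
```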
<p>Similarly, Lorentz boosts change the rapidity <span class="math-container">$\xi$</span> by a simple displacement, so <span class="math-container">$\Delta \xi$</span> is invariant. From this point, it's a simple exercise to show that </p>
<p><span class="math-container">$$|x||y|\cosh\Delta\xi=x_0y_0-x_1y_1$$</span></p>
<p>(as for the remaining dimensions -- remember that the standard Euclidean dot product is still relevant in <em>space</em>, so you just need to write <span class="math-container">$x_0y_0-x\cdot y=x_0y_0-x_1y_1-x_2y_2-x_3y_3$</span>.)</p>Tue, 20 Nov 2018 15:59:18 GMThttps://physics.stackexchange.com/questions/442119/-/442164#442164Abhimanyu Pallavi Sudhir2018-11-20T15:59:18ZComment by Abhimanyu Pallavi Sudhir on Mate in 0 moves
https://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093#74093
@FabianRöling Pawns have directions.Mon, 22 Oct 2018 09:27:58 GMThttps://puzzling.stackexchange.com/questions/74086/mate-in-0-moves/74093?cid=221467#74093Abhimanyu Pallavi Sudhir2018-10-22T09:27:58ZAnswer by Abhimanyu Pallavi Sudhir for Newton's Third Law and conservation of momentum
https://physics.stackexchange.com/questions/435941/newtons-third-law-and-conservation-of-momentum/436015#436015
3<p>As far as the actual physics is concerned, it is meaningless to talk of whether conservation of momentum is "more fundamental" than Newton's third law -- you can axiomatise classical physics in either way -- from Newton's laws, from conservation laws, from symmetry laws, from an action principle, whatever. You can prove the resulting theories are equivalent, in the sense that all the alternative axiomatic systems imply each other.</p>
<p>In terms of understanding, it makes sense to have multiple different frameworks in your head -- a symmetry-based framework is really good intuitively, especially once you understand Noether's theorem, while an action principle is the most powerful and also more useful when you leave the realm of classical physics. Treating Newton's laws as axioms isn't a great idea -- it's mostly just historically relevant.</p>
<p>When you learn more advanced physics, conservation of momentum <em>will</em> start "feeling" more fundamental -- this is simply because momentum is an interesting quantity to talk about.</p>Sun, 21 Oct 2018 21:20:57 GMThttps://physics.stackexchange.com/questions/435941/-/436015#436015Abhimanyu Pallavi Sudhir2018-10-21T21:20:57ZDerive $P \to \neg \neg P$ in a structure with not and implies
https://math.stackexchange.com/questions/2962525/derive-p-to-neg-neg-p-in-a-structure-with-not-and-implies
5<p>We can define an abstract system with the following three axiom schemes characterising <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span>:</p>
<p>ax1. <span class="math-container">$P\to(Q\to P)$</span></p>
<p>ax2. <span class="math-container">$(\lnot Q \to \lnot P)\to(P\to Q)$</span></p>
<p>ax3. <span class="math-container">$(P\to(Q\to R))\to((P\to Q)\to(P\to R))$</span></p>
<p>And any logical expressions may be substituted for <span class="math-container">$P, Q, R$</span>. Now obviously, you can't assume anything else (not even any definition of <span class="math-container">$\lnot$</span>, etc.) -- these two objects <span class="math-container">$\to$</span> and <span class="math-container">$\lnot$</span> need not have anything at all to do with the standard implication and negation we know of; they just happen to satisfy the above properties. But with the above, we can prove basic "logical laws" like:</p>
<p><span class="math-container">$$P\to P$$</span></p>
<p>(which we can prove by applying ax3 on ax1, showing that <span class="math-container">$(P\to Q)\to(P\to P)$</span>, so we just need to construct a <span class="math-container">$Q$</span> for any <span class="math-container">$P$</span> to imply, and such a <span class="math-container">$Q$</span> is provided by ax1, <span class="math-container">$Q:= (R\to P)$</span>.)</p>
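As a semantic sanity check (my own addition -- a truth-table computation, not a derivation inside the system), one can confirm that the three axiom schemes are tautologies under the usual two-valued reading of $\to$ and $\lnot$, and that $P\to\lnot\lnot P$ is one too:

```python
from itertools import product

def imp(p, q):
    # classical material implication
    return (not p) or q

def ax1(P, Q):    return imp(P, imp(Q, P))
def ax2(P, Q):    return imp(imp(not Q, not P), imp(P, Q))
def ax3(P, Q, R): return imp(imp(P, imp(Q, R)), imp(imp(P, Q), imp(P, R)))

bools = [False, True]
assert all(ax1(P, Q) for P, Q in product(bools, repeat=2))
assert all(ax2(P, Q) for P, Q in product(bools, repeat=2))
assert all(ax3(P, Q, R) for P, Q, R in product(bools, repeat=3))
assert all(imp(P, not (not P)) for P in bools)  # P -> not not P holds semantically
```

Of course, this only shows the formula is not refutable semantically; whether a syntactic proof exists is the question being asked.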
<p>Is it possible to prove this? :--</p>
<p><span class="math-container">$$P\to \lnot\lnot P$$</span></p>
<p>The person who gave me this problem insists it is provable, although it seems to me that such a proof is impossible, as none of the axioms increase the depth of <span class="math-container">$\lnot$</span>s across the <span class="math-container">$\to$</span> (i.e. none of them have a more knotty right-hand-side than left-hand-side).</p>logicpropositional-calculusaxiomshilbert-calculusFri, 19 Oct 2018 19:51:34 GMThttps://math.stackexchange.com/q/2962525Abhimanyu Pallavi Sudhir2018-10-19T19:51:34ZAnswer by Abhimanyu Pallavi Sudhir for If force is a vector, then why is pressure a scalar?
https://physics.stackexchange.com/questions/429998/if-force-is-a-vector-then-why-is-pressure-a-scalar/430008#430008
4<p>Pressure is a scalar because it does not behave as a vector -- specifically, you can't take the "components" of pressure and take their Pythagorean sum to obtain its magnitude. Instead, pressure is actually proportional to the <em>sum</em> of the components, <span class="math-container">$(P_x+P_y+P_z)/3$</span>.</p>
<p>The way to understand pressure is in terms of the stress tensor, and pressure is one-third the trace of the stress tensor. Once you understand this, the question becomes equivalent to questions like "why is the dot product a scalar?" (trace of the tensor product), "why is the divergence of a vector field a scalar?" (trace of the tensor derivative), etc. </p>
<p>There is no physical significance to taking the diagonal components of a tensor and putting them in a vector -- there <em>is</em> a physical significance to adding them up, and the invariance properties of the result tells you that it is a scalar.</p>
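To illustrate the invariance claim, here is a toy numerical sketch (my own addition; the stress tensor values are made up): the trace, and hence the pressure $(P_x+P_y+P_z)/3$, is unchanged by a rotation of the axes:

```python
import numpy as np

sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.5, 0.1],
                  [0.0, 0.1, 1.0]])   # a made-up symmetric stress tensor

theta = 0.7                            # arbitrary rotation angle about z
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

sigma_rot = R @ sigma @ R.T            # how a rank-2 tensor transforms
assert np.isclose(np.trace(sigma) / 3, np.trace(sigma_rot) / 3)  # pressure is invariant
```

The individual diagonal components do change under the rotation; only their sum survives as a scalar.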
<p>See also: <a href="https://physics.stackexchange.com/questions/186045/why-do-we-need-both-dot-product-and-cross-product/419873#419873">Why do we need both dot product and cross product?</a></p>Fri, 21 Sep 2018 08:57:17 GMThttps://physics.stackexchange.com/questions/429998/-/430008#430008Abhimanyu Pallavi Sudhir2018-09-21T08:57:17ZAnswer by Abhimanyu Pallavi Sudhir for How can the solutions to equations of motion be unique if it seems the same state can be arrived at through different histories?
https://physics.stackexchange.com/questions/426445/how-can-the-solutions-to-equations-of-motion-be-unique-if-it-seems-the-same-stat/426453#426453
1<p>"The jar is empty at present" just tells you $f(0)$. You also need $f'(0)$, $f''(0)$, etc.</p>Mon, 03 Sep 2018 09:46:25 GMThttps://physics.stackexchange.com/questions/426445/-/426453#426453Abhimanyu Pallavi Sudhir2018-09-03T09:46:25ZAnswer by Abhimanyu Pallavi Sudhir for From the speed of light being an invariant to being the maximum possible speed
https://physics.stackexchange.com/questions/331119/from-the-speed-of-light-being-an-invariant-to-being-the-maximum-possible-speed/423423#423423
0<p>A simple thought experiment does the trick -- consider a train moving faster than light, with its headlights on (it's a glass train). According to a stationary observer (stationary in a reference frame where the train is faster than light), the train must always be in front of the light, but according to an observer hanging out of the train, the light must be in front of him, since light speed is still $c$.</p>
<p>It might not seem like this relativeness of the ordering of the two objects is a problem, but it is. Say, for instance, the train is moving towards a high-tech wall which is programmed to do the following when switched ON:
(1) if hit by a train, make the world explode;
(2) if light is incident, switch OFF.
The wall is currently switched ON. According to one observer the world explodes, whereas according to another it doesn't. This is an inconsistency.</p>
<p>Why wouldn't this argument apply to <em>any</em> speed and prohibit all motion? For example, why can't the wall be programmed to switch off a certain amount of time after light is incident? Relativity says this is okay, because durations can dilate and rescale between reference frames. </p>
<p>But in order to make FTL speeds okay, you need to allow time to flip direction -- this is why the real condition is "to go faster than light, you must forgo causality", or simply, "locality = causality".</p>Sat, 18 Aug 2018 12:28:48 GMThttps://physics.stackexchange.com/questions/331119/-/423423#423423Abhimanyu Pallavi Sudhir2018-08-18T12:28:48ZAnswer by Abhimanyu Pallavi Sudhir for Link between Special relativity and Newtons gravitational law
https://physics.stackexchange.com/questions/123243/link-between-special-relativity-and-newtons-gravitational-law/423379#423379
0<p>Consider three theories:</p>
<p>$$L_A=1$$
$$L_B=1+h$$
$$L_C=1+h+h^2$$</p>
<p>Theory C reduces to Theory A when $h$ is small, and C also reduces to Theory B when $h$ is small -- doesn't this mean A and B are the same?</p>
<p>This is not a perfect analogy, but an example as to why this sort of reasoning breaks down.</p>Sat, 18 Aug 2018 07:13:36 GMThttps://physics.stackexchange.com/questions/123243/-/423379#423379Abhimanyu Pallavi Sudhir2018-08-18T07:13:36ZAnswer by Abhimanyu Pallavi Sudhir for Why is velocity defined as 4-vector in relativity?
https://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364
5<p>"It should transform like a four-vector under a Lorentz transformation" is a generalisation of several intuitions you typically have regarding how natural objects/tensors should behave in special relativity -- an obvious one is "no special status to any individual dimension", since space and time are inherently symmetric. That $dx^\mu/dx^0$ doesn't transform like a four-vector is obvious from the fact that it gives special preference to time.</p>
<p>The conventional way to define four-velocity in relativity is as $dx^\mu/ds$. Your 2-tensor idea is cute -- it is similar to the angle tensor generalised to four-dimensions -- but it doesn't satisfy the uses we have of the standard four-velocity (e.g. how would the four-momentum be defined? $m\,dx^\mu/dx^\nu$? That wouldn't be conserved.)</p>Sat, 18 Aug 2018 06:11:14 GMThttps://physics.stackexchange.com/questions/423360/why-is-velocity-defined-as-4-vector-in-relativity/423364#423364Abhimanyu Pallavi Sudhir2018-08-18T06:11:14ZComment by Abhimanyu Pallavi Sudhir on Ubuntu 17.04 Chromium Browser quietly provides full access to Google account
https://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account
Me too. This is weird. Even if it's just the Chrome browser, I don't see why they'd need <i>full</i> access to my Google account. Windows doesn't do this.Sat, 14 Jul 2018 17:11:46 GMThttps://askubuntu.com/questions/915556/ubuntu-17-04-chromium-browser-quietly-provides-full-access-to-google-account?cid=1726608Abhimanyu Pallavi Sudhir2018-07-14T17:11:46ZComment by Abhimanyu Pallavi Sudhir on How to create folder shortcut in Ubuntu 14.04?
https://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976#691976
@jave.web Yes -- use the application menu (either at the top left of your screen or a colourful icon next to the window controls) to go to your Nautilus preferences, then under "Behavior" enable link creation.Fri, 13 Jul 2018 11:49:02 GMThttps://askubuntu.com/questions/486461/how-to-create-folder-shortcut-in-ubuntu-14-04/691976?cid=1724793#691976Abhimanyu Pallavi Sudhir2018-07-13T11:49:02ZComment by Abhimanyu Pallavi Sudhir on How to customize (add/remove folders/directories) the "Places" menu of Ubuntu 13.04 "Files" application?
https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727#292727
This works. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Sat, 30 Jun 2018 09:00:13 GMThttps://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13/292727?cid=1716388#292727Abhimanyu Pallavi Sudhir2018-06-30T09:00:13ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
See <a href="https://askubuntu.com/questions/285313/how-to-customize-add-remove-folders-directories-the-places-menu-of-ubuntu-13">here</a> for a working solution. If you also want to remove the folders from the home directory, edit user-dirs.defaults, or make a copy of it in .config and edit there (for your local user).Mon, 25 Jun 2018 06:08:26 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713336#140964Abhimanyu Pallavi Sudhir2018-06-25T06:08:26ZComment by Abhimanyu Pallavi Sudhir on How to safely remove default folders?
https://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964#140964
Doesn't work -- even if you don't run the update command, it gets updated upon the next reboot. There must be a more fundamental file in which these directory names are kept.Mon, 25 Jun 2018 05:32:16 GMThttps://askubuntu.com/questions/140148/how-to-safely-remove-default-folders/140964?cid=1713326#140964Abhimanyu Pallavi Sudhir2018-06-25T05:32:16ZComment by Abhimanyu Pallavi Sudhir on Explaining the Main Ideas of Proof before Giving Details
https://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details
Because good proofs are just a formalisation of the intuitive understanding -- rather than wasting space explaining the insights, you can just give them the proof, and an even somewhat experienced reader can re-create the details.Sun, 27 May 2018 04:28:36 GMThttps://mathoverflow.net/questions/301085/explaining-the-main-ideas-of-proof-before-giving-details?cid=750004Abhimanyu Pallavi Sudhir2018-05-27T04:28:36ZAnswer by Abhimanyu Pallavi Sudhir for Intuition behind speciality of symmetric matrices
https://math.stackexchange.com/questions/1788911/intuition-behind-speciality-of-symmetric-matrices/2780461#2780461
3<p>When you were first learning about null spaces in linear algebra, your guess for the null space -- assuming you had some reasonable geometric intuition into the field -- was that the null space was orthogonal to the column space. After all, that makes sense. If your singular transformation collapses/projects <span class="math-container">$\mathbb{R}^2$</span> into a line, then the vectors that get mapped to the origin are the ones perpendicular to the column space.</p>
<p><a href="https://i.stack.imgur.com/2tgry.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/2tgry.png" alt="Is the column space perpendicular to the row space?"></a></p>
<p>Or at least, so it seems -- in reality, though, the projection doesn't need to be so nice and orthogonal. You could, for instance, <em>rotate</em> all vectors in the space by some angle and then collapse it onto a line.</p>
<p>It turns out the null space isn't perpendicular to the column space, but in fact to the <em>row space</em> instead -- these two spaces are only identical for matrices which do not perform a rotation.</p>
<p>This is a very important observation, because it tells you something about the character of matrices -- <strong>asymmetry in a matrix is a measure of how rotation-ish it is</strong>. Specifically, an antisymmetric matrix is the result of 90-degree rotations (like imaginary numbers) and a symmetric matrix is the result of scaling and skews (like real numbers). </p>
<p><span class="math-container">$$A = \underbrace {\frac{1}{2}(A + {A^T})}_{\scriptstyle{\rm{symmetric }}\atop\scriptstyle{\rm{part}}} + \underbrace {\frac{1}{2}(A - {A^T})}_{\scriptstyle{\rm{antisymmetric }}\atop\scriptstyle{\rm{part}}}$$</span></p>
<p>All matrices can be written as the sum of these two kinds -- a symmetric part and an anti-symmetric part -- much like all complex numbers can be written as the sum of a real part and an imaginary part. And this is fundamentally why symmetric matrices are "special" -- for the same reason that real numbers are special.</p>
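The decomposition is easy to verify numerically -- a minimal sketch (my own addition):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [4.0, 3.0]])

S = (A + A.T) / 2   # symmetric part ("real-number-like")
K = (A - A.T) / 2   # antisymmetric part ("imaginary-number-like")

assert np.allclose(S, S.T)
assert np.allclose(K, -K.T)
assert np.allclose(S + K, A)  # the decomposition above
```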
<hr>
<p>Notes:</p>
<p>(1) Scaling and skews are actually essentially the same thing, which is why it makes sense to include skews in the group of things that are "essentially real numbers", even though you can't really represent skews with any complex number -- real or otherwise. Skews are just scaling across a different set of axes, called "eigenvectors" (this is also why symmetric matrices have eigenvectors).</p>
<p>(2) My explanation of the analogy (between matrices and complex numbers) is oversimplified -- antisymmetric matrices actually represent <strong>90 degree rotations</strong> only, and these rotations can actually be spirals, which means they do scaling too. But the analogy still holds, because this applies to imaginary numbers too (e.g. the complex number <span class="math-container">$8i$</span> is a rotation by 90 degrees followed by a scaling by 8). </p>
<p>(3) A more accurate way to phrase the analogy is "the <strong>antisymmetric part</strong> of the matrix operates in a sub-space orthogonal to the vector being transformed while the <strong>symmetric part</strong> operates in the direction of the vector itself, so their sum spans all possible vectors of the target space". In other words, the analogy is to the <strong>Cartesian form</strong> of complex numbers -- you get to represent transformations as linear combinations of the vector itself and vectors orthogonal to it.</p>
<p>(4) It is possible to deal with at least some matrices in a way that corresponds to the <strong>polar forms</strong> of complex numbers -- this is done by representing matrices as products of <strong>symmetric matrices and orthogonal matrices</strong>, much like <span class="math-container">$re^{i\theta}$</span> represents complex numbers as products of real numbers and unit complex numbers.</p>Mon, 14 May 2018 08:42:41 GMThttps://math.stackexchange.com/questions/1788911/-/2780461#2780461Abhimanyu Pallavi Sudhir2018-05-14T08:42:41ZAnswer by Abhimanyu Pallavi Sudhir for Why is 1 not a prime number?
https://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/2757408#2757408
6<p>1 isn't a prime number for the same reason 0 isn't a basis vector.</p>
0<p>Positive integers form "almost a linear algebra": the vector space <span class="math-container">$\mathbb{Z}_{>0}$</span> over the scalar field <span class="math-container">$\mathbb{Z_{\ge0}}$</span> (okay, this is not really a field -- it's a semiring, do what you want with it, but the idea is the same) with:</p>
<ul>
<li>Primes as the "unit basis vectors" </li>
<li>Multiplication as "vector addition" </li>
<li>Exponentiation as "scalar multiplication" (e.g. <span class="math-container">$p^k$</span> represents the scalar <span class="math-container">$k$</span> multiplied by the vector <span class="math-container">$p$</span>)</li>
<li>1 as the vector 0</li>
<li>1 as the scalar 1</li>
<li>0 as the scalar 0</li>
</ul>
<p>One may check this obeys all the axioms of linear algebra, except the existence of negatives (of vectors).</p>
<p>The reason you don't call the zero vector a basis vector is that it doesn't really add anything to the formalism if you consider "<span class="math-container">$0 + e_1 + e_2$</span>" to be the same representation as "<span class="math-container">$e_1+e_2$</span>", and if you consider it to be a different representation, you're violating the idea of each vector having a unique representation in a basis. Instead, 0 is just what you have when you haven't added anything, similarly 1 is just the empty product.</p>
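This formalism is easy to play with concretely -- a small sketch (my own addition, truncating to the first four primes for illustration): multiplication becomes vector addition of exponent vectors, and co-primeness becomes orthogonality:

```python
from math import gcd

def exponent_vector(n, primes=(2, 3, 5, 7)):
    # the "coordinates" of n in the prime basis (small primes only, for illustration)
    vec = []
    for p in primes:
        k = 0
        while n % p == 0:
            n //= p
            k += 1
        vec.append(k)
    return vec

v12, v35 = exponent_vector(12), exponent_vector(35)

# multiplication = vector addition
assert exponent_vector(12 * 35) == [a + b for a, b in zip(v12, v35)]

# co-primeness = orthogonality of exponent vectors
assert gcd(12, 35) == 1
assert sum(a * b for a, b in zip(v12, v35)) == 0

# 1 is the zero vector (the empty product)
assert exponent_vector(1) == [0, 0, 0, 0]
```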
<p>Note that this formalism has a lot of other interesting analogies -- for an example, co-primeness is "orthogonality". You could also extend the formalism to rationals <span class="math-container">$\mathbb{Q}$</span> over the scalar field <span class="math-container">$\mathbb{Z}$</span> -- then it would satisfy the existence of negativeness -- although co-primeness would be more complicated (e.g. 18 would be co-prime to 3/4).</p>Sat, 28 Apr 2018 12:52:57 GMThttps://math.stackexchange.com/questions/120/why-is-1-not-a-prime-number/2757408#2757408Abhimanyu Pallavi Sudhir2018-04-28T12:52:57ZComment by Abhimanyu Pallavi Sudhir on reference for higher spin - not gravitational nor stringy
https://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy
On <a href="http://www.physicsoverflow.org/27048/reference-for-higher-spin-not-gravitational-nor-stringy?show=27499#a27499" rel="nofollow noreferrer">PhysicsOverflow</a>, there is a link to <a href="http://inspirehep.net/record/265411" rel="nofollow noreferrer">this paper</a> for the same question.Sun, 01 Mar 2015 02:25:25 GMThttps://mathoverflow.net/questions/195125/reference-for-higher-spin-not-gravitational-nor-stringy?cid=493513Abhimanyu Pallavi Sudhir2015-03-01T02:25:25ZComment by Abhimanyu Pallavi Sudhir on Classical and Quantum Chern-Simons Theory
https://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory
This has received an answer on PhysicsOverflow if you're still interested: <a href="http://www.physicsoverflow.org/22251/classical-and-quantum-chern-simons-theory#c22256" rel="nofollow noreferrer">Classical and Quantum Chern-Simons Theory</a>Thu, 14 Aug 2014 13:14:02 GMThttps://mathoverflow.net/questions/159695/classical-and-quantum-chern-simons-theory?cid=447277Abhimanyu Pallavi Sudhir2014-08-14T13:14:02ZComment by Abhimanyu Pallavi Sudhir on What is convolution intuitively?
https://mathoverflow.net/questions/5892/what-is-convolution-intuitively
<a href="http://en.wikipedia.org/wiki/File:Convolution_of_spiky_function_with_box2.gif" rel="nofollow noreferrer">Wikipedia</a>Fri, 17 Jan 2014 16:20:39 GMThttps://mathoverflow.net/questions/5892/what-is-convolution-intuitively?cid=396721Abhimanyu Pallavi Sudhir2014-01-17T16:20:39ZComment by Abhimanyu Pallavi Sudhir on Embedding of F(4) in OSp(8|4)?
https://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84
Cross-posted to: <a href="http://physics.stackexchange.com/q/41155/23119">physics.stackexchange.com/q/41155/23119</a>Mon, 23 Dec 2013 04:35:50 GMThttps://mathoverflow.net/questions/111110/embedding-of-f4-in-osp84?cid=391443Abhimanyu Pallavi Sudhir2013-12-23T04:35:50ZComment by Abhimanyu Pallavi Sudhir on How to compare Unicode characters that "look alike"?
https://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike
I compared every single pixel of it, and it looks the same.Thu, 19 Dec 2013 09:26:53 GMThttps://stackoverflow.com/questions/20674577/how-to-compare-unicode-characters-that-look-alike?cid=30963612Abhimanyu Pallavi Sudhir2013-12-19T09:26:53ZComment by Abhimanyu Pallavi Sudhir on What is the definition of picture changing operation?
https://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation
Related: <a href="http://physics.stackexchange.com/q/12595/23119">physics.stackexchange.com/q/12595/23119</a>Thu, 19 Dec 2013 07:26:36 GMThttps://mathoverflow.net/questions/152295/what-is-the-definition-of-picture-changing-operation?cid=390438Abhimanyu Pallavi Sudhir2013-12-19T07:26:36ZComment by Abhimanyu Pallavi Sudhir on Understanding the intermediate field method for the $\phi^4$ interaction
https://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction
@DanielSoltész: Nope, high-level questions generally get largely ignored there these days.Tue, 26 Nov 2013 14:40:20 GMThttps://mathoverflow.net/questions/149564/understanding-the-intermediate-field-method-for-the-phi4-interaction?cid=384774Abhimanyu Pallavi Sudhir2013-11-26T14:40:20ZComment by Abhimanyu Pallavi Sudhir on Intuition behind the ricci flow
https://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146#143146
I was about to post the same thing, I think this is very illustrative.Tue, 19 Nov 2013 16:05:08 GMThttps://mathoverflow.net/questions/143144/intuition-behind-the-ricci-flow/143146?cid=383288#143146Abhimanyu Pallavi Sudhir2013-11-19T16:05:08ZComment by Abhimanyu Pallavi Sudhir on What is the relationship between complex time singularities and UV fixed points?
https://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points
This actually got twice as many views here as on Physics.SE.Sun, 10 Nov 2013 14:50:44 GMThttps://mathoverflow.net/questions/134939/what-is-the-relationship-between-complex-time-singularities-and-uv-fixed-points?cid=381229Abhimanyu Pallavi Sudhir2013-11-10T14:50:44ZAnswer by Abhimanyu Pallavi Sudhir for The Fuchsian monodromy problem
https://mathoverflow.net/questions/146099/the-fuchsian-monodromy-problem/148462#148462
1<p>Equation 6.2 is just the Liouville action, the action principle for the <em>Liouville field</em>, which is well-known from the familiar conformal gauge. </p>
<p>$$S_L=\frac{c}{96\pi}\int_\mathcal{M}\left(\dot\varphi^2-\frac{16\varphi}{\left(1-\lvert t\rvert^2\right)^2}\right)\mathrm{d}^2t$$ </p>
<p>... along with some trivial facts about partition functions. </p>
<p>You could of course think of it as the $Z_\mathcal{M}$'s (partition functions) of the metrics being related by the $S_L$'s in the same way that the metrics are related by the Liouville field. </p>
<p>(And yes, "Liouville" is genuinely hard to spell.) </p>Sun, 10 Nov 2013 06:53:28 GMThttps://mathoverflow.net/questions/146099/-/148462#148462Abhimanyu Pallavi Sudhir2013-11-10T06:53:28ZComment by Abhimanyu Pallavi Sudhir on Modular Arithmetic in LaTeX
https://mathoverflow.net/questions/18813/modular-arithmetic-in-latex
Haha, I thought this question was about typesetting a paper in $\LaTeX$Fri, 08 Nov 2013 11:34:52 GMThttps://mathoverflow.net/questions/18813/modular-arithmetic-in-latex?cid=379817Abhimanyu Pallavi Sudhir2013-11-08T11:34:52ZAnswer by Abhimanyu Pallavi Sudhir for String theory "computation" for math undergrad audience
https://mathoverflow.net/questions/47770/string-theory-computation-for-math-undergrad-audience/147307#147307
2<p>Derive the Casimir Energy in Bosonic String Theory. </p>
<p>You start with the $\hat L_0$ operator and get rid of the non-vacuum part $\displaystyle\frac{\alpha_0^2}{2}+\sum_{n=1}^\infty\alpha_{-n}\cdot\alpha_n$, then you use Ramanujan summation to do $\zeta$-function renormalisation, from which you find that the vacuum energy, denoted $\varepsilon_0$, is </p>
<p>$$\varepsilon_0=-\frac{d-2}{24}$$ </p>
<p>However, the most interesting part comes when you go around <a href="https://mathoverflow.net/a/140354/36148">deriving</a> the critical dimension of Bosonic String Theory. </p>
<p>After which, the expression surprisingly simplifies to $-1$. </p>
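<p>The regularisation step can be sketched in a few lines of Python (a toy check, not part of the derivation; the value $\zeta(-1)=-\frac{1}{12}$ is taken as given from analytic continuation):</p>

```python
from fractions import Fraction

# zeta(-1) = -1/12: the analytically continued value behind the
# "Ramanujan sum" 1 + 2 + 3 + ... -> -1/12
ZETA_MINUS_ONE = Fraction(-1, 12)

def vacuum_energy(d):
    # each of the d - 2 transverse directions contributes (1/2) * sum_{n >= 1} n,
    # regularised to (1/2) * zeta(-1), giving -(d - 2)/24 overall
    return (d - 2) * Fraction(1, 2) * ZETA_MINUS_ONE

assert vacuum_energy(26) == -1  # in the critical dimension, exactly -1
```

<p>With $d=26$ this reproduces the $-1$ just mentioned; in general it is $-\frac{d-2}{24}$ as above.</p>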
<p>For a more detailed derivation of the above, see <a href="http://arxiv.org/pdf/hep-th/0207142v1.pdf" rel="nofollow noreferrer">these</a> lecture notes (Section 4, Equations 4.5-4.10). </p>Fri, 08 Nov 2013 04:33:41 GMThttps://mathoverflow.net/questions/47770/-/147307#147307Abhimanyu Pallavi Sudhir2013-11-08T04:33:41ZComment by Abhimanyu Pallavi Sudhir on Book on mathematical "rigorous" String Theory?
https://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998#71998
I don't think that BBS falls into the category of "mathematically rigorous". It's a very good, intuitive book.Fri, 08 Nov 2013 04:17:49 GMThttps://mathoverflow.net/questions/71909/book-on-mathematical-rigorous-string-theory/71998?cid=379753#71998Abhimanyu Pallavi Sudhir2013-11-08T04:17:49ZComment by Abhimanyu Pallavi Sudhir on About the massless supermultiplets in $2+1$ dimensional supersymmetry
https://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry
@S.Carnahan: The OP has voluntarily deleted it, which is weird... I have flagged this as unclear what you're asking.Wed, 06 Nov 2013 16:49:00 GMThttps://mathoverflow.net/questions/103392/about-the-massless-supermultiplets-in-21-dimensional-supersymmetry?cid=379331Abhimanyu Pallavi Sudhir2013-11-06T16:49:00ZAnswer by Abhimanyu Pallavi Sudhir for Does $SO(32) \sim_T E_8 \times E_8$ relate to some group theoretical fact?
https://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129
5<p>The answer to this question can be found in Lubos Motl's answer to <a href="https://physics.stackexchange.com/q/65092/23119">this question of mine on Physics.SE</a>. </p>
<p>The key here is the weight lattices $\Gamma$ of the bosonic representations of these gauge groups.</p>
<p>As I understand it, the weight lattice of $E(8)$ is $\Gamma^8$, whereas the weight lattice of $\frac{\operatorname{Spin}\left(32\right)}{\mathbb{Z}_2}$ is $\Gamma^{16}$. The first fact means that the weight lattice of $E(8)\times E(8)$ is $\Gamma^{8}\oplus\Gamma^8$. </p>
<p>Now, there is an identity: $\Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}=\Gamma^{16}\oplus\Gamma^{1,1}$, and it is this very identity which actually allows the T-duality mentioned in the original post. </p>
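<p>As a concrete illustration (a sketch; the Dynkin-diagram labelling below is just one standard choice), one can check that the Gram matrix of the $E_8$ root lattice -- its Cartan matrix -- is even and unimodular, which is the defining property of the $\Gamma^8$ factor:</p>

```python
# Gram matrix of the E8 root lattice in a basis of simple roots = the E8 Cartan matrix.
# Dynkin diagram (one standard labelling): chain 0-1-2-3-4-5-6 with node 7 attached to node 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 7)]
G = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
for i, j in edges:
    G[i][j] = G[j][i] = -1

def det(M):
    """Fraction-free (Bareiss) integer determinant."""
    M = [row[:] for row in M]
    n, sign, prev = len(M), 1, 1
    for k in range(n - 1):
        if M[k][k] == 0:
            for i in range(k + 1, n):
                if M[i][k] != 0:
                    M[k], M[i], sign = M[i], M[k], -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                M[i][j] = (M[i][j] * M[k][k] - M[i][k] * M[k][j]) // prev
        prev = M[k][k]
    return sign * M[-1][-1]

assert det(G) == 1                               # unimodular
assert all(G[i][i] % 2 == 0 for i in range(8))   # even
```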
<p>So, the answer to your question is "<strong>Yes</strong>", there <em>is</em> a group-theoretical fact, and that is that $ \Gamma^{8}\oplus\Gamma^8\oplus\Gamma^{1,1}= \Gamma^{16}\oplus\Gamma^{1,1} $. </p>Wed, 06 Nov 2013 16:46:03 GMThttps://mathoverflow.net/questions/57529/does-so32-sim-t-e-8-times-e-8-relate-to-some-group-theoretical-fact/147129#147129Abhimanyu Pallavi Sudhir2013-11-06T16:46:03ZAnswer by Abhimanyu Pallavi Sudhir for Why does bosonic string theory require 26 spacetime dimensions?
https://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354
5<p><em>Note that here, the $\hat L_n$ are operators given by sums of dot products of the mode operators, e.g. $\hat L_0=\frac{\hat\alpha_0^2}{2}+\sum_{n=1}^\infty\hat\alpha_{-n}\cdot\hat\alpha_n$.</em> </p>
<p>Also note that the Virasoro algebra is the central extension of the Witt (conformal) algebra; that explains why a $D$ appears -- it plays the role of the central charge. </p>
<p>I'll expand on Chris Gerig's answer. </p>
<p>Not only do we need $D=26$, we also need the normal ordering constant $a=1$. The normal ordering constant $a$ is the eigenvalue of $\hat L_0$ on physical states. </p>
<p>We want to promote the time-like states to spurious, zero-norm states, right? So, we impose the (level 1) spurious state conditions on the state as follows (the $|\chi\rangle$ are the basis vectors on which the spurious state $|\Phi\rangle$ is built): </p>
<p>$$ \begin{gathered}
0 = {{\hat L}_1}\left| \Phi \right\rangle \\
{\text{ }} = {{\hat L}_1}{{\hat L}_{ - 1}}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left[ {{{\hat L}_1},{{\hat L}_{ - 1}}} \right]\left| {{\chi _1}} \right\rangle + {{\hat L}_{ - 1}}{{\hat L}_1}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = \left[ {{{\hat L}_1},{{\hat L}_{ - 1}}} \right]\left| {{\chi _1}} \right\rangle \\
{\text{ }} = 2{{\hat L}_0}\left| {{\chi _1}} \right\rangle \\
{\text{ }} = 2\left( {a - 1} \right)\left| {{\chi _1}} \right\rangle \\
\end{gathered} $$</p>
<p>That means that $a=1$. </p>
<p>Now, for a level 2 spurious state, </p>
<p>$$\begin{gathered}
\left[ {{{\hat L}_1},{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right] = 3{{\hat L}_{ - 1}} + 2k{{\hat L}_0}{{\hat L}_{ - 1}} + 2k{{\hat L}_{ - 1}}{{\hat L}_0} = \left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_0}{{\hat L}_{ - 1}} \\
0 = {{\hat L}_1}\left| \Phi \right\rangle = {{\hat L}_1}\left( {{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_0}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle \\
{\text{ }} = \left( {\left( {3 - 2k} \right){{\hat L}_{ - 1}} + 4k{{\hat L}_{ - 1}}\left( {{{\hat L}_0} + 1} \right)} \right)\left| {{\chi _2}} \right\rangle \\
{\text{ }} = \left( {3 - 2k} \right){{\hat L}_{ - 1}}\left| {{\chi _2}} \right\rangle \\
2k = 3 \\
k = \frac{3}{2} \\
\end{gathered} $$ </p>
<p>Since this level 2 spurious state can be written as: </p>
<p>$$ {\left| \Phi \right\rangle = \left( {{{\hat L}_{ - 2}} + k{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle }$$ </p>
<p>So, then, </p>
<p>$$ \begin{gathered}
{{\hat L}_2}\left| \Phi \right\rangle = 0 \\
{{\hat L}_2}\left( {{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right)\left| {{\chi _2}} \right\rangle = 0 \\
\left[ {{{\hat L}_2},{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| {{\chi _2}} \right\rangle + \left( {{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right){{\hat L}_2}\left| {{\chi _2}} \right\rangle = 0 \\
\left[ {{{\hat L}_2},{{\hat L}_{ - 2}} + \frac{3}{2}{{\hat L}_{ - 1}}{{\hat L}_{ - 1}}} \right]\left| {{\chi _2}} \right\rangle = 0 \\
\left( {13{{\hat L}_0} + 9{{\hat L}_{ - 1}}{{\hat L}_{ + 1}} + \frac{D}{2}} \right)\left| {{\chi _2}} \right\rangle = 0 \\
\text{Since $\hat L_0|\chi_2\rangle = -|\chi_2\rangle$ and $\hat L_{+1}|\chi_2\rangle=0$, we have } \frac{D}{2} = 13 \\
D = 26 \\
\end{gathered} $$ </p>
<p>And then, finally, Q.E.D. </p>
<p>So, this was done essentially to remove the imaginary-norm ghost states, using the canonical / Gupta-Bleuler formalism. </p>
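<p>A tiny arithmetic cross-check of the numbers appearing above (a sketch; the anomaly term used is the standard Virasoro one, $[\hat L_m,\hat L_{-m}]=2m\hat L_0+\frac{D}{12}m(m^2-1)$):</p>

```python
from fractions import Fraction

def central_term(m, D):
    # the anomaly term in [L_m, L_{-m}] = 2m L_0 + (D/12) m (m^2 - 1)
    return Fraction(D, 12) * m * (m * m - 1)

# no central term at level 1, which is why the level-1 argument only fixes a = 1
assert central_term(1, 26) == 0

# the D/2 in (13 L_0 + 9 L_{-1} L_{+1} + D/2)|chi_2> = 0 is the m = 2 central term
assert central_term(2, 26) == 13

# with L_0|chi_2> = -|chi_2> and L_1|chi_2> = 0, the condition reads -13 + D/2 = 0
assert -13 + Fraction(26, 2) == 0
```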
<p>It's also possible to use, e.g., light cone gauge (LCG) quantisation. However, in other quantisation methods, the conformal anomaly is manifest in other forms. E.g., in LCG quantisation, it is manifest as a failure of Lorentz symmetry. A good overview of this method can be found in <strong>Kaku</strong> <em>Strings, Conformal fields, and M-theory</em> (it's the only part of the book that I liked, actually. The rest of the book is too rigorous, without much physical intuition.). </p>Sun, 25 Aug 2013 09:40:17 GMThttps://mathoverflow.net/questions/99643/why-does-bosonic-string-theory-require-26-spacetime-dimensions/140354#140354Abhimanyu Pallavi Sudhir2013-08-25T09:40:17ZAnswer by Abhimanyu Pallavi Sudhir for Coincidence, purposeful definition, or something else in formulas for energy
https://physics.stackexchange.com/questions/71119/coincidence-purposeful-definition-or-something-else-in-formulas-for-energy/71121#71121
4<p>Most of them (all of your examples except <span class="math-container">$E=c^2m$</span>, which is really just <span class="math-container">$E=m$</span> anyway) arise from integrating a linear equation like <span class="math-container">$p=mv$</span> as <span class="math-container">$E=\int v\,dp$</span>, and it is often just a convention that we choose the linear relation to have a constant of proportionality of 1, so the integral has a constant of 1/2 (for example, we could've instead chosen, like we do with areas of circles, to have <span class="math-container">$c=2\pi r$</span> and <span class="math-container">$A=\pi r^2$</span>). </p>Mon, 15 Jul 2013 04:01:14 GMThttps://physics.stackexchange.com/questions/71119/-/71121#71121Abhimanyu Pallavi Sudhir2013-07-15T04:01:14ZAnswer by Abhimanyu Pallavi Sudhir for Is velocity of light constant?
https://physics.stackexchange.com/questions/66856/is-velocity-of-light-constant/68513#68513
1<p>There are two questions here -- is the velocity of light <em>constant</em>, and is it <em>invariant</em>?</p>
<p>The direction/velocity of light changes whenever it interacts with something. This includes gravitational deflection, since things have to change direction in curved spacetime in one sense or another. The velocity isn't constant.</p>
<p>Is it invariant under Lorentz boosts in perpendicular directions? <em>No.</em> The speed is invariant, but the velocity isn't. This should be fairly clear, but you can prove it with brute force --</p>
<p>We need to apply a boost to light's four-velocity, but the four-velocity of light is actually infinite -- it's (infinity, infinity, 0, 0), where the infinities satisfy a certain relation, in the sense of being related through a limit. So we consider an object traveling at speed $w$ in the $x$-direction, apply a boost $v$ in the $y$-direction, and let $w\to c$. The four-velocity transforms under this boost as:</p>
<p>$$\left[ {\begin{array}{*{20}{c}}{\gamma (w)}\\{w\gamma (w)}\\0\\0\end{array}} \right] \to \left[ {\begin{array}{*{20}{c}}{\gamma (v)\gamma (w)}\\{w\gamma (w)}\\{ - v\gamma (v)\gamma (w)}\\0\end{array}} \right]$$</p>
<p>The conventional 3-velocity can be extracted here by considering $dx/dt$, $dy/dt$:</p>
<p>$$\frac{{dx}}{{dt}} = \frac{{dx/d\tau }}{{dt/d\tau }} = \frac{{w\gamma (w)}}{{\gamma (v)\gamma (w)}} = \frac{w}{{\gamma (v)}}$$
$$\frac{{dy}}{{dt}} = \frac{{dy/d\tau }}{{dt/d\tau }} = \frac{{ - v\gamma (v)\gamma (w)}}{{\gamma (v)\gamma (w)}} = - v$$</p>
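<p>Numerically (a quick sketch in units with $c=1$; $v=0.6$ is an arbitrary test value), the limiting 3-velocity $(1/\gamma(v),-v,0)$ indeed still has magnitude 1:</p>

```python
import math

v = 0.6                                  # perpendicular boost speed (test value)
gamma_v = 1.0 / math.sqrt(1.0 - v * v)

# 3-velocity of the light ray after the boost, in the w -> 1 limit of the above
vx, vy = 1.0 / gamma_v, -v

assert abs(math.hypot(vx, vy) - 1.0) < 1e-12  # speed is still 1 (invariant)
assert (vx, vy) != (1.0, 0.0)                 # but the velocity has changed
```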
<p>Taking the limit as $w\to 1$, you get a 3-velocity of $(1/\gamma(v),-v, 0)$ -- one may confirm that this is not equivalent to the original three-velocity that was $(1,0,0)$, but nonetheless has the same magnitude (speed is invariant).</p>Wed, 19 Jun 2013 04:17:58 GMThttps://physics.stackexchange.com/questions/66856/-/68513#68513Abhimanyu Pallavi Sudhir2013-06-19T04:17:58ZAnswer by Abhimanyu Pallavi Sudhir for Capacitors' working in a circuit
https://physics.stackexchange.com/questions/68387/capacitors-working-in-a-circuit/68426#68426
2<p>The answer is just "yes, obviously, the voltage is zero". The answer below is unnecessarily computational, but I'm keeping it in case someone likes that.</p>
<hr>
<p><strong>Archived answer</strong></p>
<p>I'll assume you are talking about a circuit with a capacitor and resistor inside. Then, let $Q$ be the charge, $t$ be the time, $C$ be the capacitance, $R$ be the resistance, $T$ be the time constant, and $V$ be the electromotive force. You must know of the differential equation:
$$\frac{{{\text{d}}Q}}{{{\text{d}}t}} = \frac{{ - Q + CV}}{T}$$</p>
<p>Separating the equation, integrating (with $Q(0)=0$ and $T=RC$), and differentiating back,
$$\frac{{{\text{d}}Q}}{{{\text{d}}t}} = \frac{V}{R}\exp \left( { - \frac{t}{T}} \right)$$</p>
<p>Since the current is the rate of change in the charge with respect to time, we can rewrite this equation in the following form:</p>
<p>$$I = \frac{V}{R}\exp \left( { - \frac{t}{T}} \right)$$</p>
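<p>A quick numerical cross-check of this solution (a sketch; $R$, $C$, $V$ are arbitrary test values): integrate the original differential equation with Euler steps up to $t=T$ and compare the current with $\frac{V}{R}e^{-1}$.</p>

```python
import math

R, C, V = 1000.0, 1e-6, 5.0   # test values: resistance, capacitance, EMF
T = R * C                     # time constant
Q = 0.0                       # capacitor starts uncharged
steps = 100_000
dt = T / steps

# Euler-integrate dQ/dt = (CV - Q)/T over one time constant
for _ in range(steps):
    Q += dt * (C * V - Q) / T

I = (C * V - Q) / T           # current at t = T, from the same equation
assert abs(I - (V / R) * math.exp(-1)) < 1e-6
```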
<p>So, the potential of the battery being equal to the potential of the capacitor simply means that $V=0$, so
$$I = \frac{0}{R}\exp \left( { - \frac{t}{T}} \right)=0$$</p>
<p>So yes, although it will take an infinite amount of time to reach this point.</p>Tue, 18 Jun 2013 06:20:12 GMThttps://physics.stackexchange.com/questions/68387/-/68426#68426Abhimanyu Pallavi Sudhir2013-06-18T06:20:12ZAnswer by Abhimanyu Pallavi Sudhir for Measuring extra-dimensions
https://physics.stackexchange.com/questions/22542/measuring-extra-dimensions/68414#68414
4<p>The standard way to measure compactified dimensions is to test some inverse-square law (e.g. Newton's, electromagnetic, diffusion) at the scale and see if it breaks down and starts approaching some other (higher power) inverse-power law.</p>
<p>In fact, the inverse-square law has only been verified down to a scale of 0.1mm -- here's a recent experimental paper doing this: <a href="http://arxiv.org/abs/hep-ph/0011014v1" rel="nofollow noreferrer">[1]</a>.</p>
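<p>The crossover itself is easy to see in a toy model (a sketch, not the experimental analysis: a point source in 4 spatial dimensions, one of them compactified on a circle of circumference $L$, summed over its tower of images):</p>

```python
import math

L = 1.0  # compactification circumference (test value)

def potential(r, n_images=100_000):
    # point source with one direction compactified on a circle: sum over the
    # tower of images spaced L apart; each contributes the 4-spatial-dimension
    # Coulomb/Newton potential power, 1/(distance^2)
    return sum(1.0 / (r * r + (n * L) ** 2) for n in range(-n_images, n_images + 1))

# far away (r >> L) the potential looks 3-dimensional: V ~ (pi/L) / r
r_far = 50.0
assert abs(potential(r_far) * r_far - math.pi / L) < 1e-2

# close in (r << L) it looks higher-dimensional: V ~ 1 / r^2
r_near = 1e-3
assert abs(potential(r_near) * r_near ** 2 - 1.0) < 1e-3
```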
<p>(Yes, you can measure time in metres, by multiplying by the speed of light. This is where "lightseconds" and other such measurements of distance come from. An example motivation for treating this as the unit of the time dimension is from the Minkowski metric, $ds^2=c^2dt^2-dx^2-dy^2-dz^2$, where $ct$ is a dimension analogous to the spatial ones.)</p>Tue, 18 Jun 2013 04:16:35 GMThttps://physics.stackexchange.com/questions/22542/-/68414#68414Abhimanyu Pallavi Sudhir2013-06-18T04:16:35ZAnswer by Abhimanyu Pallavi Sudhir for A change in the gravitational law
https://physics.stackexchange.com/questions/41109/a-change-in-the-gravitational-law/68326#68326
5<p>Such a change requires a 4+1-dimensional spacetime instead of a 3+1-dimensional one -- this would have several serious implications --</p>
<ol>
<li><p>The Riemann curvature tensor gains new "parts" with interesting physical implications with each new spacetime dimension -- 1-dimensional manifolds have no curvature in this sense, 2-dimensional manifolds have a scalar curvature, 3-dimensional manifolds gain the full Ricci tensor, 4-dimensional manifolds get components corresponding to a new Weyl tensor, and 5-dimensional geometry gets even more components. General relativity in such a spacetime is capable of explaining electromagnetism too (Kaluza-Klein theory), so electromagnetism (along with the radion field) starts behaving as a part of gravity.</p></li>
<li><p>Apparently a 5-dimensional spacetime is unstable, according to wikipedia's "privileged character of 3+1-dimensional spacetime"<a href="http://en.wikipedia.org/wiki/Spacetime#Privileged_character_of_3.2B1_spacetime" rel="nofollow noreferrer">[1]</a> (now a transclusion of <a href="https://en.wikipedia.org/wiki/Anthropic_principle#Dimensions_of_spacetime" rel="nofollow noreferrer">[2]</a>).</p></li>
<li><p>The string theory landscape would be a bit smaller, since there are fewer dimensions to compactify.</p></li>
<li><p>The Ricci curvature in a vacuum on an Einstein manifold would no longer be exactly $\Lambda g_{ab}$ -- in $n$ spacetime dimensions the vacuum Einstein equations give $R_{ab}=\frac{2\Lambda}{n-2}g_{ab}$, so in 4+1 dimensions there would be a coefficient of $\frac{2}{3}$.</p></li>
<li><p>The magnetic field, among other things "cross product-ish", could not be written as a vector, unlike the electric field. This is because it would have 6 independent components, whereas space would only have 4 dimensions. So, perhaps humans would become familiar with exterior algebras earlier than us who live in 3+1 dimensions. Either that, or we would still be trying to find out how magnetism works. Or we would just die out, for all the other reasons.</p></li>
<li><p>In string theory (see e.g. <a href="http://arxiv.org/abs/hep-th/0207249v1" rel="nofollow noreferrer">[3]</a>), gravitational constants in successively higher dimensions are calculated as $G_{n+1}=l_sG_n$, where $l_s$ is the string length (the units must be different in order to accommodate the extra factor of $r$ in Newton's gravitational law). For distance scales greater than the string length, this causes gravity to be much weaker than in our number of dimensions, but stronger for length scales shorter than the string length. It's interesting how gravity's long-range ability peaks at 4 dimensions (it is a contact force below 4 dimensions).</p></li>
</ol>
<p>See also some recent tests of the inverse square law at short length scales (to check for compactification) -- <a href="http://arxiv.org/abs/hep-ph/0011014" rel="nofollow noreferrer">[4]</a>.</p>Mon, 17 Jun 2013 10:12:52 GMThttps://physics.stackexchange.com/questions/41109/-/68326#68326Abhimanyu Pallavi Sudhir2013-06-17T10:12:52ZAnswer by Abhimanyu Pallavi Sudhir for Mass of a superstring between two branes?
https://physics.stackexchange.com/questions/46118/mass-of-a-superstring-between-two-branes/68240#68240
2<p>It's similar -- </p>
<p>$${m^2} = \left( {N - a} \right) + {\left( {\frac{y}{{2\pi }}} \right)^2}$$</p>
<p>The important difference is that the number operator and normal ordering constant change for a superstring, and vary by sector.</p>Sun, 16 Jun 2013 11:12:27 GMThttps://physics.stackexchange.com/questions/46118/-/68240#68240Abhimanyu Pallavi Sudhir2013-06-16T11:12:27ZAnswer by Abhimanyu Pallavi Sudhir for How is it that angular velocities are vectors, while rotations aren't?
https://physics.stackexchange.com/questions/286/how-is-it-that-angular-velocities-are-vectors-while-rotations-arent/65738#65738
6<p>You are mixing up different things. A rotation transformation is a transformation of vectors in a linear space -- such a transformation doesn't need to have any angular velocities or anything, and it doesn't even need to have anything to do with a mechanical rotation.</p>
<p>The angular velocity is the rate of a physical rotation, measured as $\vec\omega=d\vec\theta/dt$, where $\vec\theta$ is <em>also</em> a vector, the rotational analog of displacement.</p>
<p>In any case, the $\vec\theta$ is not the same as the matrix of rotation. The latter is a <em>function</em> of $\vec\theta$, but a matrix can be used to represent a lot more things than just a rotation. Note that a rotation can still be modelled as a time-dependent matrix itself, like $\vec{x}(t)=A(t)\vec{x}(0)$, but the matrix is still not the same as the angle of rotation.</p>
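<p>To make the distinction concrete, here is a small sketch (pure Python; the angle vector is a made-up example): the rotation matrix is produced as a <em>function</em> of the axis-angle vector $\vec\theta$, here via Rodrigues' formula, and is plainly a different object from $\vec\theta$ itself.</p>

```python
import math

def rotation_matrix(theta):
    """Rodrigues' formula: the rotation matrix as a function of an axis-angle vector."""
    x, y, z = theta
    t = math.sqrt(x * x + y * y + z * z)   # rotation angle = |theta|
    if t == 0.0:
        return [[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0]]
    x, y, z = x / t, y / t, z / t          # unit axis
    c, s, C = math.cos(t), math.sin(t), 1.0 - math.cos(t)
    return [[c + x*x*C,   x*y*C - z*s, x*z*C + y*s],
            [y*x*C + z*s, c + y*y*C,   y*z*C - x*s],
            [z*x*C - y*s, z*y*C + x*s, c + z*z*C]]

def apply(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

# a quarter turn about z sends x-hat to y-hat
R = rotation_matrix([0.0, 0.0, math.pi / 2])
assert all(abs(a - b) < 1e-12
           for a, b in zip(apply(R, [1.0, 0.0, 0.0]), [0.0, 1.0, 0.0]))
```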
<hr>
<p>Note: I've been a bit sneaky in claiming that $\vec\theta$ is a "vector" -- it's really not, although it happens to have 3 components in 3 dimensions so it's conventional to write the "xy" component as the "z" component, "xz" as the "y" component, "yz" as "x", but in general it's best to think of angles as (2, 0) tensors $\theta^{\mu\nu}$. Interestingly, the rotation transformation is a (1, 1) tensor $A^{\mu}{}_{\nu}$.</p>Fri, 24 May 2013 12:20:55 GMThttps://physics.stackexchange.com/questions/286/-/65738#65738Abhimanyu Pallavi Sudhir2013-05-24T12:20:55ZAnswer by Abhimanyu Pallavi Sudhir for Can someone please explain magnetic vs electric fields?
https://physics.stackexchange.com/questions/53916/can-someone-please-explain-magnetic-vs-electric-fields/65091#65091
<p>The electric and magnetic fields are two faces of a single object, mixing and transforming into each other under Lorentz boosts. The full picture of the field comes from the electromagnetic field tensor</p>
<p>$$F_{\mu\nu} = \begin{bmatrix}
0 & E_x/c & E_y/c & E_z/c \\
-E_x/c & 0 & -B_z & B_y \\
-E_y/c & B_z & 0 & -B_x \\
-E_z/c & -B_y & B_x & 0
\end{bmatrix}$$</p>
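<p>As a small numeric sketch (units with $c=1$; the field values are arbitrary), one can build this tensor and check both its antisymmetry and the Lorentz invariant $F_{\mu\nu}F^{\mu\nu}=2(B^2-E^2)$:</p>

```python
import numpy as np

E = np.array([0.3, -0.2, 0.5])   # arbitrary test field values, units with c = 1
B = np.array([0.1, 0.4, -0.6])

F = np.array([[ 0.0,   E[0],  E[1],  E[2]],
              [-E[0],  0.0,  -B[2],  B[1]],
              [-E[1],  B[2],  0.0,  -B[0]],
              [-E[2], -B[1],  B[0],  0.0]])

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # mostly-minus Minkowski metric
F_upper = eta @ F @ eta                  # raise both indices

assert np.allclose(F, -F.T)                                  # antisymmetry
assert np.isclose((F * F_upper).sum(), 2 * (B @ B - E @ E))  # Lorentz invariant
```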
<p>This tensor satisfies simple identities (see <a href="https://en.wikipedia.org/wiki/Electromagnetic_tensor#Significance" rel="nofollow noreferrer">[1]</a>) equivalent to Maxwell's equations. The electric and magnetic fields are different components of this tensor, placed in similar positions as e.g. the momentum and shear stress in the 4d stress tensor.</p>Sun, 19 May 2013 05:01:31 GMThttps://physics.stackexchange.com/questions/53916/-/65091#65091Abhimanyu Pallavi Sudhir2013-05-19T05:01:31Z