Jekyll2018-01-31T00:58:32-05:00//Andrew TanWelcome to my site!
introductory quantum information reading list2018-01-31T00:57:00-05:002018-01-31T00:57:00-05:00/jekyll/update/2018/01/31/quantum-information-reading<h3 id="story">story</h3>
<p>The following post was adapted from a report I wrote for a reading course.
I might add to this as I continue reading.
I think it serves as a decent starting point for those interested in getting started with quantum computation and quantum cryptography.
This post isn’t self-contained; think of it as a pointer to some early seminal papers in the field.</p>
<p>If you’re interested in the actual mathematical formalisms common in quantum information theory, I would highly recommend “Quantum Computation and Quantum Information” by Nielsen and Chuang.</p>
<h3 id="introduction">introduction</h3>
<p>Encoding information in quantum systems provides an opportunity for the development of novel communication and computing architectures that exploit unitary quantum dynamics.
So-called quantum parallelism provides opportunities for speedups over classical computation and the quantum no-cloning theorem allows for the construction of cryptographic protocols that derive security from fundamental physics.</p>
<h3 id="quantum-computation">quantum computation</h3>
<h4 id="universal-quantum-computer">universal quantum computer</h4>
<p>It is useful to have a concrete model in mind when talking about quantum computers.
Early work by Deutsch [1] extended the Church-Turing hypothesis for classical computers through the proposal of a quantum Turing machine, a simple model that captures the computational power of any computation that can be performed with unitary quantum dynamics.
Deutsch showed that a universal quantum computer could not compute non-recursive functions, demonstrating that quantum computers can solve the same set of problems as classical computers.</p>
<p>Bernstein and Vazirani [2] later constructed an efficient quantum Turing machine and demonstrated a key relationship between problems that could be solved by a universal quantum computer with high probability in polynomial time, the complexity class known as bounded-error quantum polynomial time (\(BQP\)), and problems solvable by a classical Turing machine (with an oracle for randomness) with high probability in polynomial time, the complexity class known as bounded-error probabilistic polynomial time (\(BPP\), a superset of \(P\)).
Formally, they showed that \(P \subseteq BPP \subseteq BQP \subseteq PSPACE\), so these classes all coincide if \(P=PSPACE\); it is commonly believed that this equality does not hold and that \(BPP\) is properly contained in \(BQP\).
Later, Bennett et al. [3] gave evidence that \(NP \not\subseteq BQP\); however, this is also not proven.
To summarize, it is commonly believed (but not proven) that \(NP \not\subseteq BQP\), but \(P \subsetneq BQP\).
This means problems that are provably NP-complete are likely safe from super-polynomial quantum speedup, but some classically difficult problems (believed to lie in \(NP \setminus P\)) could be solved efficiently by quantum computers – some of these problems are described below.</p>
<h4 id="quantum-speedup">quantum speedup</h4>
<p>Significant classes of classically intractable problems have been shown to be feasible with a universal quantum computer.
Algorithms providing exponential speedups have been proposed for problems upon which the security of classical cryptographic primitives is based.
Efficient algorithms have been demonstrated for the (Abelian) Hidden Subgroup Problem, upon which the security of popular cryptographic protocols, including the RSA cryptosystem and Diffie-Hellman key exchange, is based [4].
This has sparked interest in the development of cryptosystems resistant to quantum speedup, so-called post-quantum cryptography [5].
Shor’s algorithm is a polynomial-time quantum algorithm for prime factorization and computation of the discrete logarithm [6].
These algorithms are based on finding the period of hidden subgroups and rely on the efficiency of the Quantum Fourier Transform [4].
Significant technical challenges need to be overcome before quantum supremacy can be practically demonstrated; however, the promise of exponential speedup presents a clear challenge to modern cryptographers.
The complexity-theoretic analysis also does not rule out the possibility of useful polynomial speedups for NP-complete problems.
Grover [7] proposed a database search algorithm with a quadratic speedup over classical methods. Grover’s algorithm does not require the hidden subgroup structure of Shor’s algorithm, but nonetheless provides significant practical advantage for brute-force attacks on cryptography.</p>
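<p>To make the period-finding idea above a little more concrete, here is a minimal Python sketch of the classical half of Shor’s factoring algorithm. It assumes the order \(r\) of \(a\) modulo \(n\) is already known; the hypothetical helper <code>multiplicative_order</code> finds it by brute force, which is exactly the step a quantum computer would replace with period finding. The function names and the choice of \(n=15\), \(a=7\) are my own illustration, not taken from [6].</p>

```python
from math import gcd

def multiplicative_order(a, n):
    """Smallest r > 0 with a**r % n == 1 (brute force stand-in for the quantum step)."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def factor_from_order(n, a):
    """Try to split n using the order of a modulo n; returns two factors or None."""
    g = gcd(a, n)
    if g != 1:
        return g, n // g          # lucky guess: a already shares a factor with n
    r = multiplicative_order(a, n)
    if r % 2 == 1:
        return None               # odd order: try another a
    y = pow(a, r // 2, n)
    if y == n - 1:
        return None               # a^(r/2) = -1 (mod n): try another a
    p = gcd(y - 1, n)
    return p, n // p

print(factor_from_order(15, 7))   # (3, 5)
```

<p>Here 7 has order 4 modulo 15, so \(7^2 = 4\) is a non-trivial square root of 1 and \(\gcd(4-1, 15) = 3\) splits 15.</p>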
<h4 id="practical-quantum-computers">practical quantum computers</h4>
<p>There are several proposed models for realizing universal quantum computation, the most popular being the quantum circuit model and adiabatic quantum computing. The two approaches have been proven to be polynomially equivalent [8].
Maintaining coherence throughout quantum computation is the largest challenge to producing a practical quantum computer: effective quantum error correction is key for any practical computing scheme. Quantum error correction was first proposed by Shor [9].
Gottesman [10] introduced a group-theoretic stabilizer formalism that is commonly used for describing quantum error correcting codes.</p>
<h3 id="quantum-cryptography">quantum cryptography</h3>
<h4 id="defining-security">defining security</h4>
<p>The benchmark for security of classical cryptographic protocols is a notion known as semantic security. It is often settled for instead of the stronger notion of perfect security since the latter, while possible, is practically infeasible [11].
The security of a quantum cryptosystem is based not on information-theoretic grounds but on physical ones. The commonly accepted notion of the security of a quantum key distribution protocol, proposed by Mayers [12], is known as unconditional security.</p>
<h4 id="quantum-cryptographic-primitives">quantum cryptographic primitives</h4>
<p>The quantum no-cloning theorem, first described by Wootters and Zurek [13], is the basis for much of quantum cryptography.
Early work by Wiesner [14], using the consequence of the no-cloning property, showed the promise of using two-state quantum systems, qubits, for cryptography.
Wiesner demonstrated an encoding technique that would make it highly improbable that attackers could undetectably disturb quantum information encoded in conjugate bases. Bennett and Brassard [15] used this idea to devise the BB84 quantum key exchange protocol and a coin-tossing algorithm.
Bennett and Brassard’s coin-tossing protocol can be used to implement a cryptographic primitive known as bit commitment, whereby one party wishes to verifiably commit to a choice of bit without revealing information about the committed bit until later.
Mayers [16] and, independently, Lo and Chau [17] showed that the two key requirements of a quantum bit commitment scheme are incompatible.
Namely, no quantum protocol can simultaneously guarantee that no information about the committed bit is gained before the reveal and that the committed bit cannot be changed after commitment.</p>
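<p>As a rough illustration of why disturbing states encoded in conjugate bases is detectable, here is a toy classical simulation of BB84-style sifting with an optional intercept-and-resend eavesdropper. This is only a sketch under simplifying assumptions of my own (ideal qubits, no channel noise, a measurement in the wrong basis modelled as a fair coin flip); the function name <code>bb84</code> is hypothetical and not from [15].</p>

```python
import random

def bb84(n_qubits, eavesdrop, seed=0):
    """Return (sifted key length, error rate) for one simulated run."""
    rng = random.Random(seed)
    kept = errors = 0
    for _ in range(n_qubits):
        bit, alice_basis = rng.randint(0, 1), rng.randint(0, 1)
        state_bit, state_basis = bit, alice_basis
        if eavesdrop:
            eve_basis = rng.randint(0, 1)
            # Measuring in the conjugate basis gives a uniformly random outcome.
            eve_bit = state_bit if eve_basis == state_basis else rng.randint(0, 1)
            state_bit, state_basis = eve_bit, eve_basis   # Eve resends what she measured
        bob_basis = rng.randint(0, 1)
        bob_bit = state_bit if bob_basis == state_basis else rng.randint(0, 1)
        if bob_basis == alice_basis:                      # sifting: keep matching bases only
            kept += 1
            errors += bob_bit != bit
    return kept, errors / kept

print("no eavesdropper:  ", bb84(20000, eavesdrop=False))
print("with eavesdropper:", bb84(20000, eavesdrop=True))
```

<p>Without an eavesdropper the sifted keys agree exactly; with one, roughly a quarter of the sifted bits are expected to disagree, which Alice and Bob can notice by comparing a sample of their key.</p>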
<h3 id="references">references</h3>
<p>[1] D. Deutsch, “Quantum theory, the Church-Turing principle and the universal quantum computer,” Proceedings of the Royal Society of London A, pp. 97-117, 1985.</p>
<p>[2] E. Bernstein and U. Vazirani, “Quantum Complexity Theory,” SIAM Journal on Computing, vol. 26, no. 5, pp. 1411-1473, 1997.</p>
<p>[3] C. H. Bennett, E. Bernstein, G. Brassard and U. Vazirani, “Strengths and Weaknesses of Quantum Computing,” SIAM Journal on Computing, vol. 26, no. 5, pp. 1510-1523, 1997.</p>
<p>[4] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information, New York: Cambridge University Press, 2010.</p>
<p>[5] D. Stebila, Practical post-quantum key exchange, Cambridge: QCrypt 2017, 2017.</p>
<p>[6] P. W. Shor, “Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer,” SIAM Journal on Computing, vol. 26, no. 5, pp. 1484-1509, 1997.</p>
<p>[7] L. K. Grover, “A fast quantum mechanical algorithm for database search,” in Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, Philadelphia, 1996.</p>
<p>[8] D. Aharonov, W. van Dam, J. Kempe, Z. Landau, S. Lloyd and O. Regev, “Adiabatic Quantum Computation is Equivalent to Standard Quantum Computation,” SIAM Journal on Computing, vol. 37, no. 1, pp. 166-194, 2007.</p>
<p>[9] P. W. Shor, “Scheme for reducing decoherence in quantum computer memory,” Phys. Rev. A, vol. 52, no. 4, pp. R2493-R2496, 1995.</p>
<p>[10] D. Gottesman, “Stabilizer Codes and Quantum Error Correction,” Pasadena, 1997.</p>
<p>[11] C. E. Shannon, “Communication theory of secrecy systems,” The Bell System Technical Journal, vol. 28, no. 4, pp. 656-715, 1949.</p>
<p>[12] D. Mayers, “Unconditional security in quantum cryptography,” Journal of the ACM, vol. 48, no. 3, pp. 351-406, 2001.</p>
<p>[13] W. K. Wootters and W. H. Zurek, “A single quantum cannot be cloned,” Nature, vol. 299, pp. 802-803, 1982.</p>
<p>[14] S. Wiesner, “Conjugate Coding,” ACM SIGACT News, vol. 15, no. 1, pp. 78-88, 1983.</p>
<p>[15] C. H. Bennett and G. Brassard, “Quantum cryptography: Public key distribution and coin tossing,” Proceedings of IEEE International Conference on Computers, Systems and Signal Processing, vol. 175, p. 8, 1984.</p>
<p>[16] D. Mayers, “The trouble with quantum bit commitment,” 1996.</p>
<p>[17] H.-K. Lo and H. F. Chau, “Is Quantum Bit Commitment Really Possible?,” Phys. Rev. Lett, vol. 78, no. 17, p. 3410, 1997.</p>storyR/Z piano tuning2017-07-10T23:22:00-04:002017-07-10T23:22:00-04:00/jekyll/update/2017/07/10/modulo-harmony<h3 id="story">story</h3>
<p>Discussion regarding piano tuning is already widespread on the internet.
<a href="https://youtu.be/1Hqm0dYKUx4">This</a> YouTube video from Minute Physics provides a decent introduction to the topic.</p>
<p>I ended up writing a short MATLAB script to plot some of these different tuning systems and see if I could hear the difference.
Here are some of the things I tried.
Most of these have actual names in music theory: I’ve tried to link to their Wikipedia page when possible.</p>
<h3 id="introduction">introduction</h3>
<p>Firstly, here are the standard <a href="https://en.wikipedia.org/wiki/Interval_(music)#Size_of_intervals_used_in_different_tuning_systems">just intervals</a> for reference.</p>
<p>Since we’re basing things off the octave (2:1), it’s useful to express other intervals in terms of their base 2 logarithm.
This translates all musical intervals to a real number.
It seems intuitive to take intervals modulo 1 (i.e. the same notes in different octaves are identified).
This has a group structure isomorphic to \(\mathbb{R}/\mathbb{Z}\).</p>
<p>For a visualization respecting the group structure, all the notes within an octave are plotted on a circle.
The standard 12-tone <a href="https://en.wikipedia.org/wiki/Equal_temperament">equal temperament</a> (12-TET) is labeled around the circle for easy comparison.</p>
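<p>I don’t have the original MATLAB script handy, so here is a small Python sketch of the mapping described above: an interval’s frequency ratio is sent to its base 2 logarithm modulo 1, and then compared against the nearest 12-TET note. The function names are my own, and the ratios used (3:2, 5:4, 6:5) are the usual just intervals.</p>

```python
from math import log2

def circle_position(ratio):
    """Map a frequency ratio to [0, 1): its base-2 logarithm modulo 1."""
    return log2(ratio) % 1.0

def cents_from_12tet(ratio):
    """Signed distance in cents from the nearest 12-TET note (100 cents = 1 semitone)."""
    semitones = 12 * circle_position(ratio)
    return 100 * (semitones - round(semitones))

for name, ratio in [("perfect 5th", 3 / 2), ("major 3rd", 5 / 4), ("minor 3rd", 6 / 5)]:
    print(f"{name}: position {circle_position(ratio):.4f}, "
          f"{cents_from_12tet(ratio):+.1f} cents from 12-TET")
```

<p>Multiplying the position by \(2\pi\) gives the angle at which the note would be drawn on the circle.</p>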
<h3 id="harmonic-generator">harmonic generator</h3>
<p>Let’s choose some justly tuned intervals and compare the notes they generate to the familiar 12-TET scale.</p>
<p>The minor 2nd is the basic unit and will generate approximations to the 12-TET scale within the first order.
Unfortunately, it’s off by over a semitone by the time it completes the entire octave.
<img src="/assets/images/harm-min2_13.png" alt="minor 2nd generator" /></p>
<p>Here’s the circle with the justly tuned perfect 5th as a generator.
This is one way to visualize <a href="https://en.wikipedia.org/wiki/Pythagorean_tuning">Pythagorean tuning</a>.
<img src="/assets/images/harm-per5_13.png" alt="perfect 5th generator" /></p>
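<p>The drift of the two generators above can be checked with a few lines of Python. This is a sketch rather than the script used for the plots; it assumes the just minor 2nd is 16:15 and the just perfect 5th is 3:2, and the helper <code>stack</code> is a name I made up.</p>

```python
from math import log2

def stack(ratio, count):
    """Positions on the octave circle after stacking `ratio` 0..count times."""
    step = log2(ratio)
    return [(k * step) % 1.0 for k in range(count + 1)]

# Twelve just minor 2nds (16:15) overshoot the octave.
minor_2nd_drift = stack(16 / 15, 12)[-1]            # in octaves
print(f"minor 2nd: {12 * minor_2nd_drift:.2f} semitones past the octave")

# Twelve just perfect 5ths (3:2) overshoot seven octaves by the Pythagorean comma.
fifth_drift = stack(3 / 2, 12)[-1]                  # in octaves
print(f"perfect 5th: {1200 * fifth_drift:.1f} cents past seven octaves")
```

<p>The first number is why the minor 2nd generator ends up more than a semitone off, and the second is the small gap that Pythagorean tuning has to hide somewhere.</p>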
<p>Here’s a standard major scale using justly tuned major and minor 2nds.
<img src="/assets/images/harm-ionian.png" alt="Ionian mode" /></p>
<p>Here’s what A major sounds like using different tunings.
There are three iterations of A major: the first is tuned using equal temperament, the second uses just major and minor 2nds, and the third plays both together so the beats can be heard.</p>
<audio controls="" style="margin-left: auto;margin-right:auto;display:block">
<source src="/assets/sounds/harm-ionian.mp3" type="audio/mpeg" />
Your browser does not support the audio element.
</audio>
<p>Let’s extend these a bit for fun.
Here’s how the perfect 5th (3:2) drifts around the circle.</p>
<p><img src="/assets/images/harm-per5_130.png" alt="perfect 5th generator 130 iterations" /></p>
<p>Here’s a minor 3rd (6:5).
To be honest, I’m including this mainly because it looks cool.
This is because the 19th element in the geometric sequence \(\left(\frac{6}{5}\right)^n\) closely approximates a power of 2, creating a near-orbit (let me know if there’s a better term for this) of size 20.</p>
<p><img src="/assets/images/harm-min3_130.png" alt="minor 3rd generator 130 iterations" /></p>
<p>It might be interesting to find other near-orbits.</p>
<p>That is,</p>
<p>\[
n \log_2{r} \approx 0\pmod 1
\]</p>
<p>Of course the most interesting near-orbits are those where \(n\) is small (i.e. the number of notes in the orbit is small) and \(r\) is a ratio of relatively small integers.</p>
<p>Quick note – we know that it is impossible to generate an exact orbit with \(r \in \mathbb{Q}\) since \(\forall p \in \mathbb{Z}, p>1 \implies \sqrt[p]{2} \notin \mathbb{Q}\) (this is why we have to turn to irrationals to tune our pianos in the first place).</p>
<p>The standard 12-TET is the case where \(n=12\) and \(r=\sqrt[12]{2}\).</p>
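<p>Here is a brute-force Python sketch of a near-orbit search along these lines. The tolerance (0.02 octaves, about 24 cents), the bound on the integers in the ratio and the function name are all arbitrary choices of mine.</p>

```python
from math import gcd, log2

def near_orbits(max_n=60, max_int=9, tol=0.02):
    """List (error, n, p, q): smallest n with n*log2(p/q) within `tol` of an integer."""
    found = []
    for q in range(1, max_int):
        for p in range(q + 1, min(2 * q, max_int + 1)):   # ratios strictly between 1 and 2
            if gcd(p, q) != 1:
                continue                                  # skip unreduced fractions
            step = log2(p / q)
            for n in range(1, max_n + 1):
                total = n * step
                error = abs(total - round(total))         # distance to a whole octave count
                if error < tol:
                    found.append((error, n, p, q))
                    break                                 # keep only the smallest n per ratio
    return sorted(found)

for error, n, p, q in near_orbits():
    print(f"{p}:{q} nearly closes after n={n} steps (off by {1200 * error:.1f} cents)")
```

<p>With these settings the perfect 5th (3:2) shows up with \(n=12\) and the minor 3rd (6:5) with \(n=19\), matching the two plots above.</p>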
<h3 id="discussion">discussion</h3>
<p>I’m not sure if this post lends itself to any practical use, but I think these diagrams are pretty.</p>story[note] useful commutator relations2017-05-08T16:25:00-04:002017-05-08T16:25:00-04:00/jekyll/update/2017/05/08/commutator-relations<h3 id="story">story</h3>
<p>This is going to be a quick post.
Trying to look up common commutation relationships is a pain.
I’m going to compile a list of these relationships that I’ve found useful.
This post should be treated as a reference.
I will likely update this post as time goes on.</p>
<h3 id="quick-introduction">quick introduction</h3>
<p>Unfortunately, a lot of useful algebras are not commutative.
The commutator describes how close two elements are to commuting and often encapsulates some interesting properties.</p>
<p>\[
[A,B] = AB - BA
\]</p>
<p>For the rest of this post, capital letters \(A\), \(B\), \(C\) etc. will be used to denote elements of a non-commutative algebra and lowercase letters will be used to denote elements of the field.</p>
<h3 id="basic-commutator-algebra">basic commutator algebra</h3>
<p>Addition within the commutator,
\[
[A+B,C]=[A,C]+[B,C]
\]</p>
<p>Multiplication within the commutator,
\[
[A,BC]=B[A,C]+[A,B]C
\]</p>
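<p>Both identities are easy to spot-check numerically. The snippet below is a minimal sketch using plain Python lists as \(2 \times 2\) integer matrices (the particular matrices are arbitrary), so the comparison is exact.</p>

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def sub(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def comm(x, y):
    """Commutator [x, y] = xy - yx."""
    return sub(matmul(x, y), matmul(y, x))

A = [[0, 1], [2, 3]]
B = [[1, -1], [0, 2]]
C = [[2, 0], [1, 1]]

# [A+B, C] = [A, C] + [B, C]
print(comm(add(A, B), C) == add(comm(A, C), comm(B, C)))
# [A, BC] = B[A, C] + [A, B]C
print(comm(A, matmul(B, C)) == add(matmul(B, comm(A, C)), matmul(comm(A, B), C)))
```

<p>Both lines print <code>True</code>; note that the order of the factors in the second identity matters.</p>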
<p>This one shows up sometimes when composing unitary transformations,
\[
\frac{\partial}{\partial t}(e^{tA}Be^{-tA})=e^{tA}[A,B]e^{-tA}
\]</p>
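<p>As a quick sketch of why this holds, apply the product rule and use the fact that \(A\) commutes with \(e^{\pm tA}\):</p>
<p>\[
\frac{\partial}{\partial t}(e^{tA}Be^{-tA}) = e^{tA}ABe^{-tA} - e^{tA}BAe^{-tA} = e^{tA}(AB-BA)e^{-tA} = e^{tA}[A,B]e^{-tA}
\]</p>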
<h3 id="baker-campbell-hausdorf--lite">Baker-Campbell-Hausdorff (-lite)</h3>
<p>The full <a href="https://en.wikipedia.org/wiki/Baker%E2%80%93Campbell%E2%80%93Hausdorff_formula">BCH</a> relationship is very general and unwieldy.
Luckily, in a lot of situations the commutator commutes with the original elements, truncating the BCH formula after the first commutator term.
Small note: this might actually be known as the Zassenhaus formula, with the real BCH being its dual, but I think most people would identify this as the BCH relationship.</p>
<p>If
\[
[A,[A,B]]=[B,[A,B]]=0
\]
then,
\[
e^{t(A+B)} = e^{tA}e^{tB}e^{-\frac{t^2}{2}[A,B]}
\]</p>
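<p>A concrete case where the condition holds is the set of strictly upper triangular \(3 \times 3\) matrices. The Python sketch below (my own example, not from any reference) checks the formula exactly using fractions; since these matrices satisfy \(X^3=0\), the exponential series stops after the quadratic term.</p>

```python
from fractions import Fraction

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def add(*matrices):
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*matrices)]

def exp_nilpotent(x):
    """Exact exponential I + x + x^2/2, valid only when x^3 = 0."""
    n = len(x)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return add(identity, x, scale(Fraction(1, 2), matmul(x, x)))

A = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
AB = add(matmul(A, B), scale(-1, matmul(B, A)))   # [A, B], which commutes with A and B

t = Fraction(3, 2)
lhs = exp_nilpotent(scale(t, add(A, B)))
rhs = matmul(matmul(exp_nilpotent(scale(t, A)), exp_nilpotent(scale(t, B))),
             exp_nilpotent(scale(-t * t / 2, AB)))
print(lhs == rhs)  # True
```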
<p>The two relationships below, which hold under the same condition, are also closely related to the BCH expansion,</p>
<p>\[
e^BAe^{-B}=A+[B,A]
\]</p>
<p>\[
Ae^B = e^BA+[A,B]e^B
\]</p>
<p>The first one is a truncated form of what is also known as the Hadamard lemma.</p>
<h3 id="jacobi-identity">Jacobi identity</h3>
<p>Another one that shows up frequently is the Jacobi identity.</p>
<p>\[
[A,[B,C]] + [B,[C,A]] + [C,[A,B]] = 0
\]</p>
<p>Rewritten, the Jacobi identity can be seen as a generalization of associativity,</p>
<p>\[
[[A,B],C] = [A,[B,C]] - [B,[A,C]]
\]</p>
<p>The standard cross product also satisfies the Jacobi identity (the lowercase letters below are elements of the vector space \(\mathbb{R}^3\)),</p>
<p>\[
(a \times b) \times c = a \times (b \times c) - b \times (a \times c)
\]</p>
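<p>The cross product version can be checked directly. Below is a small Python sketch with arbitrary integer vectors, so the arithmetic is exact.</p>

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

a, b, c = (1, 2, 3), (-4, 0, 5), (2, -1, 7)

# Cyclic form: a x (b x c) + b x (c x a) + c x (a x b) = 0
jacobi = add(add(cross(a, cross(b, c)), cross(b, cross(c, a))), cross(c, cross(a, b)))
print(jacobi)  # (0, 0, 0)

# Rewritten form: (a x b) x c = a x (b x c) - b x (a x c)
print(cross(cross(a, b), c) == sub(cross(a, cross(b, c)), cross(b, cross(a, c))))  # True
```

<p>For these vectors \((a \times b) \times c \neq a \times (b \times c)\), so the cross product really is non-associative; the extra term in the rewritten identity is what accounts for the difference.</p>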
<h3 id="revision-history">revision history</h3>
<p>May 9 – started post</p>storyexploring isometric embeddings of Riemannian manifolds2017-02-02T16:46:00-05:002017-02-02T16:46:00-05:00/jekyll/update/2017/02/02/embedding<h3 id="story">story</h3>
<p>I’ve been trying to make my way through some concepts in differential geometry and tensor calculus, and I’ve been having difficulty visualizing the effect of a changing metric tensor.</p>
<p>My intuition has been to picture manifolds embedded in a Euclidean space \(\mathbb{R}^n\), and this appears to be the approach taken in most books.
However, it is not immediately clear that this intuitive picture is equivalent to the typical treatment of a manifold as a topological space described with coordinate charts.
These two pictures are known as the <a href="https://en.wikipedia.org/wiki/Manifold#Intrinsic_and_extrinsic_view">extrinsic and intrinsic</a> pictures respectively.</p>
<p>This post will focus on isometric embeddings, as they seem more tangible.</p>
<p>Here’s my attempt at visualizing this concept through embedding of a 2D manifold equipped with an arbitrary metric tensor in \(\mathbb{R}^3\).
I also calculate a few other quantities on the manifold, like the <a href="https://en.wikipedia.org/wiki/Levi-Civita_connection">Levi-Civita connection</a>, the <a href="https://en.wikipedia.org/wiki/Riemann_curvature_tensor">Riemann tensor</a> and the <a href="https://en.wikipedia.org/wiki/Ricci_curvature">Ricci</a> tensor and scalar, to help get a better understanding of how that all relates to the metric.</p>
<h3 id="hand-wavy-introduction-to-manifolds">hand-wavy introduction to manifolds</h3>
<p>While reading this, please keep in mind that I’m probably the least qualified person to be writing about this subject.</p>
<p>I’m going to attempt to motivate and explain a few terms that will be key to limiting the scope of this post.</p>
<h4 id="riemannian">‘Riemannian’</h4>
<p>A Riemannian manifold is a manifold equipped with a Riemannian metric.
A Riemannian metric allows for the definition of an inner product on the tangent space \(T_pM\) on every point \(p\) on the manifold \(M\).</p>
<p>More specifically, the Riemannian metric is a symmetric, positive definite bilinear map from pairs of tangent vectors to the reals.</p>
<p>\[
g_p : T_pM \times T_pM \to \mathbb{R}, \quad \forall p \in M
\]</p>
<p>If a basis is specified, \(g_p\) can be expressed as a positive definite \(m \times m\) matrix, where \(m\) is the dimension of the manifold.</p>
<h4 id="embedding">‘embedding’</h4>
<p>An embedding in topology between two topological spaces \(X\) and \(Y\) is a continuous injective mapping (more precisely, one that is a homeomorphism onto its image)</p>
<p>\[
f : X \to Y
\]</p>
<p>Embedding a manifold in some canonical space such as \(\mathbb{R}^n\) is typically used as a way to visualize the manifold.
For example, imagining the 2-sphere in \(\mathbb{R}^3\) can be viewed as an embedding.</p>
<p>In many descriptions the 2-sphere is defined explicitly in reference to \(\mathbb{R}^3\) (i.e. as the set of points equidistant from a reference point (as measured in \(\mathbb{R}^3\))); however, this picture is limiting and misleading in topology.
It is important to see a manifold like the 2-sphere as an object independent of some external embedding.
This allows for much more general geometries.</p>
<h4 id="isometric">‘isometric’</h4>
<p>When a manifold is equipped with a metric, it is possible to measure distances between points on the manifold.
Isometric embeddings are those that preserve these distances when using the standard metric in \(\mathbb{R}^n\).</p>
<p>This post will focus on isometric embeddings as I think they provide more physical intuition.
Imagine manifolds made of a flexible but non-stretchable material like paper instead of rubber.</p>
<h4 id="compact">‘compact’</h4>
<p>There is a rigorous definition of compactness which I have not spent the time to understand, but the notion of <a href="http://mathworld.wolfram.com/CompactManifold.html">compactness</a> is quite intuitive.
Roughly, a manifold sitting in \(\mathbb{R}^n\) is compact if it is closed and bounded, so the distance between any two points cannot grow without limit (e.g. the sphere is compact, the plane is not).</p>
<p>A less obvious fact about compact, connected 2-dimensional manifolds (without boundary) is that they are completely specified, up to homeomorphism, by their <a href="https://en.wikipedia.org/wiki/Orientability">orientability</a> and <a href="https://en.wikipedia.org/wiki/Genus_(mathematics)">genus</a>.</p>
<h3 id="sphere-example">2-sphere example</h3>
<p>Let’s start with a concrete example.</p>
<p>For the purposes of this post, I will be looking at 2D manifolds embedded in \(\mathbb{R}^3\) in the interest of being able to visualize the manifold.
I will also be using the Euclidean metric of signature \((0,2)\) for simplicity.</p>
<p>Let’s choose some coordinates for our manifold</p>
<p>\[
\xi^a = \hat{\xi^a}(x^1,x^2) = \hat{\xi^a}(x^\mu)
\]</p>
<p>where \(a \in \{1,2,3\}\) and \(\mu \in \{1,2\}\).
I’m using largely standard notation so I will gloss over the finer details.</p>
<h4 id="metric-tensor">metric tensor</h4>
<p>The metric tensor \(g_{\mu\nu}\) allows us to measure arclengths.
I usually remember it in the following context</p>
<p>\[
ds^2 = g_{\mu\nu} dx^\mu dx^\nu
\]</p>
<p>Of course, the metric is also related to the covariant basis, which we can easily obtain now that we have a coordinate system</p>
<p>\[
g_{\mu\nu} = \vec{e_\mu} \cdot \vec{e_\nu} = \frac{\partial \xi^a}{\partial x^\mu} \frac{\partial \xi^b}{\partial x^\nu} \eta_{ab}
\]</p>
<p>where \(\eta_{ab}=I_3\), the \(3 \times 3\) identity matrix in this case (since \(a,b \in \{1,2,3\}\) and we’re using a Euclidean metric signature).</p>
<p>To make all of this more concrete, let’s look at the 2-sphere with the typical spherical coordinates</p>
<p>\[
\xi^1 = \rho \sin{x^1} \cos{x^2}
\]</p>
<p>\[
\xi^2 = \rho \sin{x^1} \sin{x^2}
\]</p>
<p>\[
\xi^3 = \rho \cos{x^1}
\]</p>
<p>The metric tensor in these coordinates is</p>
<p>\[
(g)_{\mu\nu} =
\begin{bmatrix}
\rho^2 & 0 \\
0 & \rho^2\sin^2{x^1} \\
\end{bmatrix}
\]</p>
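<p>This result can be sanity-checked numerically by differentiating the embedding with central differences and contracting with \(\eta_{ab}\). The Python sketch below does that; the radius, the sample point and the step size are arbitrary choices of mine.</p>

```python
from math import sin, cos

RHO = 2.0  # sphere radius (arbitrary value for the example)

def embedding(x1, x2):
    """Standard embedding of the 2-sphere in R^3 using spherical coordinates."""
    return (RHO * sin(x1) * cos(x2), RHO * sin(x1) * sin(x2), RHO * cos(x1))

def metric(x1, x2, h=1e-5):
    """g_{mu nu} = sum_a (d xi^a/d x^mu)(d xi^a/d x^nu), via central differences."""
    def partial(mu):
        d1, d2 = (h, 0.0) if mu == 0 else (0.0, h)
        plus = embedding(x1 + d1, x2 + d2)
        minus = embedding(x1 - d1, x2 - d2)
        return [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    e = [partial(0), partial(1)]  # the two covariant basis vectors
    return [[sum(a * b for a, b in zip(e[mu], e[nu])) for nu in range(2)] for mu in range(2)]

g = metric(0.7, 1.3)
print(g)                                   # numerical metric
print(RHO**2, RHO**2 * sin(0.7)**2)        # expected diagonal entries
```

<p>The off-diagonal entries come out as zero up to rounding, and the diagonal entries match \(\rho^2\) and \(\rho^2\sin^2{x^1}\).</p>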
<h4 id="levi-civita-connection">Levi-Civita Connection</h4>
<p>Let’s quickly compute the Levi-Civita connection for the 2-sphere.
I will try to demonstrate the geometric intuition for this later.</p>
<p>\[
\Gamma^\sigma_{\mu\nu} = \frac{1}{2} g^{\sigma\rho}(\partial_\mu g_{\nu\rho} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu})
\]</p>
<p>where \(\partial_\mu \equiv \frac{\partial}{\partial x^\mu}\).</p>
<p>Applying the above equations,</p>
<p>\[
\Gamma^1_{22} = -\sin{x^1} \cos{x^1}
\]</p>
<p>and</p>
<p>\[
\Gamma^2_{12} = \Gamma^2_{21} = \cot{x^1}
\]</p>
<p>with all the other components of the connection equal to zero.</p>
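<p>If you’d rather not push indices around by hand, the connection can also be evaluated numerically straight from the metric. The Python sketch below differentiates the sphere metric with central differences and applies the formula for \(\Gamma^\sigma_{\mu\nu}\) above; the radius, sample point and step size are arbitrary, and the code uses 0-based indices where the text uses 1-based ones.</p>

```python
from math import sin, cos, tan

RHO = 1.0  # sphere radius (the connection does not depend on it)

def g(x):
    """Metric of the 2-sphere in spherical coordinates, x = (x1, x2)."""
    return [[RHO**2, 0.0], [0.0, RHO**2 * sin(x[0])**2]]

def dg(x, r, h=1e-5):
    """Central-difference derivative of the metric with respect to coordinate r."""
    xp, xm = list(x), list(x)
    xp[r] += h
    xm[r] -= h
    gp, gm = g(xp), g(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(x):
    """gamma[s][m][n] = 1/2 g^{sr} (d_m g_{nr} + d_n g_{rm} - d_r g_{mn})."""
    gx = g(x)
    ginv = [[1 / gx[0][0], 0.0], [0.0, 1 / gx[1][1]]]   # valid because g is diagonal
    d = [dg(x, r) for r in range(2)]                    # d[r][i][j] = d_r g_{ij}
    return [[[0.5 * sum(ginv[s][r] * (d[m][n][r] + d[n][r][m] - d[r][m][n])
                        for r in range(2))
              for n in range(2)] for m in range(2)] for s in range(2)]

x = (0.9, 0.3)
gamma = christoffel(x)
for s in range(2):
    for m in range(2):
        for n in range(2):
            if abs(gamma[s][m][n]) > 1e-6:
                print(f"Gamma^{s + 1}_{m + 1}{n + 1} = {gamma[s][m][n]:+.6f}")
print("-sin*cos =", -sin(x[0]) * cos(x[0]), " cot =", 1 / tan(x[0]))
```

<p>Only three components are printed, and they are symmetric in the two lower indices as expected for the Levi-Civita connection.</p>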
<h4 id="riemann-curvature-tensor">Riemann curvature tensor</h4>
<p>Now that we have the Christoffel symbols, let’s also quickly compute the curvature tensor for the 2-sphere.
The Riemann tensor can be derived from the Levi-Civita connection calculated above,</p>
<p>\[
R^\rho_{\sigma\mu\nu} = \partial_\mu \Gamma^\rho_{\nu\sigma} - \partial_\nu \Gamma^\rho_{\mu\sigma} + \Gamma^\rho_{\mu\lambda} \Gamma^\lambda_{\nu\sigma} - \Gamma^\rho_{\nu\lambda} \Gamma^\lambda_{\mu\sigma}
\]</p>
<p>Applying the above equations, the non-zero components of the Riemann tensor are</p>
<p>\[
R^1_{212} = -R^1_{221} = \sin^2{x^1}
\]</p>
<p>and</p>
<p>\[
R^2_{112} = -R^2_{121} = -1
\]</p>
<p>So the Levi-Civita connection and the Riemann tensor are both specified for a manifold equipped with a metric.
Now to help visualize the manifold, let’s try to embed it in \(\mathbb{R}^3\).</p>
<h4 id="ricci-tensor">Ricci tensor</h4>
<p>The Ricci tensor is closely related to the Riemann tensor,</p>
<p>\[
R_{\mu\nu} = R^\lambda_{\mu\lambda\nu}
\]</p>
<p>For the 2-sphere,</p>
<p>\[
(R)_{\mu\nu} =
\begin{bmatrix}
1 & 0 \\
0 & \sin^2{x^1} \\
\end{bmatrix}
\]</p>
<h4 id="ricci-scalar">Ricci scalar</h4>
<p>Since we also know the metric, we can get the Ricci scalar,</p>
<p>\[
R = g^{\mu\nu} R_{\mu\nu}
\]</p>
<p>For the 2-sphere,</p>
<p>\[
R = \frac{2}{\rho^2}
\]</p>
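<p>Spelling out the contraction, using the inverse of the diagonal metric, \(g^{11} = 1/\rho^2\) and \(g^{22} = 1/(\rho^2\sin^2{x^1})\):</p>
<p>\[
R = g^{11}R_{11} + g^{22}R_{22} = \frac{1}{\rho^2} \cdot 1 + \frac{1}{\rho^2\sin^2{x^1}} \cdot \sin^2{x^1} = \frac{2}{\rho^2}
\]</p>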
<h3 id="going-the-other-way">going the other way</h3>
<p>Is it possible in general to go from the intrinsic notion of a manifold equipped with a metric to an embedding in \(\mathbb{R}^3\)?</p>
<h4 id="system-of-pdes">system of PDEs</h4>
<p>Given a manifold equipped with a metric, \(g_{\mu\nu}\), we know that</p>
<p>\[
\frac{\partial \xi^a}{\partial x^\mu} \frac{\partial \xi^b}{\partial x^\nu} \eta_{ab} = g_{\mu\nu}
\]</p>
<p>This is a system of non-linear partial differential equations.
Embedding an \(m\)-dimensional manifold in an \(n\)-dimensional space, we naively have \(m^2\) equations and \(n\) free variables.
Of course, since we are dealing with a Riemannian metric, we have, in general, \(m(m+1)/2\) independent equations.</p>
<p>For the 2-sphere in \(\mathbb{R}^3\) and \((0,2)\) signature described above, we have</p>
<p>\[
\left(\frac{\partial \xi^1}{\partial x^1}\right)^2 +
\left(\frac{\partial \xi^2}{\partial x^1}\right)^2 +
\left(\frac{\partial \xi^3}{\partial x^1}\right)^2 =
\rho^2
\]</p>
<p>and</p>
<p>\[
\left(\frac{\partial \xi^1}{\partial x^2}\right)^2 +
\left(\frac{\partial \xi^2}{\partial x^2}\right)^2 +
\left(\frac{\partial \xi^3}{\partial x^2}\right)^2 =
\rho^2\sin^2{x^1}
\]</p>
<p>You can check that the standard embedding of the 2-sphere satisfies this system of PDEs.
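In this case the remaining off-diagonal equation is simpler because of the diagonal nature of \(\eta_{ab}\) and \(g_{\mu\nu}\): it only requires the tangent vectors \(\partial \xi^a / \partial x^1\) and \(\partial \xi^a / \partial x^2\) to be orthogonal.</p>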
<p>Of course, in general, solving non-linear systems of PDEs is not easy.
In fact even <a href="https://en.wikipedia.org/wiki/Nonlinear_partial_differential_equation#Existence_and_uniqueness_of_solutions">existence and uniqueness</a> are difficult to ascertain in general.</p>
<h4 id="the-torus">the torus</h4>
<p>Let’s quickly look at the case of the flat torus.</p>
<p>A lot of confusion seems to stem from the overloading of seemingly familiar terms and ideas.</p>
<p>For example, in topology, the torus is the Cartesian product of two circles \(S^1 \times S^1\).
This definition does not necessitate a metric.
The torus can be equipped with an arbitrary metric later (or given one through embedding in \(\mathbb{R}^n\)).
The typical geometric notion of a torus (the one I imagine) is the embedding in \(\mathbb{R}^3\), known as the ring torus.
This embedding artificially imposes a metric on the torus, but care should be taken not to mistake this metric for a property of the topological torus.
For example, alternative embeddings of the torus are possible, such as the <a href="https://en.wikipedia.org/wiki/Clifford_torus">Clifford torus</a> in \(\mathbb{R}^4\).</p>
<p>A metric can be assigned to the torus <em>before</em> embedding.
For example, a torus can be given a flat metric that has zero curvature.
This is known as a <a href="https://en.wikipedia.org/wiki/Torus#Flat_torus">flat torus</a>.
It is understandably difficult to imagine what a flat torus would look like.
By this, I mean it is uncertain whether or not we can find an isometric embedding of the flat torus in \(\mathbb{R}^3\), where we derive most of our intuition.
Imagine bending a piece of paper, which has a flat metric, into a cylinder and then bringing the two ends of the cylinder together.
Initially, this does not seem possible without stretching the paper.
Is it possible to embed the flat torus in \(\mathbb{R}^3\)?
Does smoothness have to be sacrificed?
This is where the <a href="https://en.wikipedia.org/wiki/Nash_embedding_theorem">Nash embedding theorems</a> make an appearance.</p>
<h4 id="nash-kuiper-theorem">Nash-Kuiper theorem</h4>
<p>Roughly speaking, given a Riemannian manifold \((M,g)\), the <a href="">Nash-Kuiper theorem</a> says that we can find a \(C^1\) isometric embedding \(f: M^m \to \mathbb{R}^n\) for \(n \ge m+1\).</p>
<p>This means that there is a continuously differentiable isometric embedding of the flat torus (a 2-dimensional Riemannian manifold) into \(\mathbb{R}^3\).
This is somewhat surprising.</p>
<p>Of course this theorem does not actually bring us any closer to finding an isometric embedding, but it is nice to know that one exists.</p>
<p>Recently, Borrelli et al. (2012) [1] demonstrated such an embedding.
The basic idea is to introduce smooth fractal corrugations into the ring torus in order to preserve the flat metric.
It is clear that embedding 2-dimensional Riemannian manifolds in \(\mathbb{R}^3\), while possible, is not always easy.</p>
<p><img src="/assets/images/embd-flat_torus.png" alt="flat torus in R^3" /></p>
<p>The image above (taken from [1]) shows the first four levels of corrugations for this embedding of the flat torus in \(\mathbb{R}^3\).</p>
<p>It is clear that even in the case of the common torus equipped with the relatively benign flat metric, embedding is non-trivial.</p>
<p>Going from the extrinsic picture to the intrinsic one is obvious.
Moving from an intrinsic definition to an extrinsic one is more difficult, but the Nash embedding theorems show that it is generally possible (if you have enough dimensions to spare).</p>
<h3 id="few-questions">few questions</h3>
<p>Here is a list of questions that I’ve had along the way.</p>
<h4 id="are-c1-isometric-embeddings-of-a-compact-manifold-unique">are \(C^1\) isometric embeddings of a compact manifold unique?</h4>
<p>No.
My intuition for this is to imagine longitudinally warping a paper cylinder.</p>
<p>I’m curious to see if there are sufficient conditions on the metric that would guarantee a unique embedding in this case.
Is imposing non-zero curvature sufficient (probably not)?
I’m trying to imagine isometrically warping a paper sphere.</p>
<h4 id="what-are-the-sufficient-conditions-for-a-cinfty-embedding-an-m-dimensional-manifold-in-mathbbrm1">what are the sufficient conditions for a \(C^\infty\) embedding of an \(m\)-dimensional manifold in \(\mathbb{R}^{m+1}\)?</h4>
<p>This is essentially asking “what topologies and metrics can be smoothly embedded one dimension up?”
The intuition for this is that we typically imagine 1-dimensional and 2-dimensional manifolds embedded in \(\mathbb{R}^2\) and \(\mathbb{R}^3\) respectively.
What makes these objects special?</p>
<p>The 2-sphere embedded in \(\mathbb{R}^3\) above is one example of this.</p>
<p>I don’t have a good answer for this currently.
This is probably something to research if I have time.
I’d be very happy to talk if anyone has thoughts about this.</p>
<h3 id="future-work">future work</h3>
<p>I don’t really have anything planned here.
My initial goal here was just to get a better intuition for visualizing Riemannian manifolds and to obtain some intuition for curvature of the manifolds.
I didn’t expect to delve too deeply in topology and embedding theorems and it seems like I’ve just scratched the surface.
Just as a note to myself, reading through a few of the things discussed in this post has motivated the topic of algebraic topology; probably should look into that eventually.</p>
<h3 id="references">References</h3>
<p>[1] V. Borrelli, S. Jabrane, F. Lazarus, and B. Thibert, “Flat tori in three-dimensional space and convex integration,” Proceedings of the National Academy of Sciences of the United States of America, vol. 109, no. 19, pp. 7218–7223, Mar. 2012.</p>
<p>[2] T. Tao, “Notes on the Nash embedding theorem,” 2016. [Online]. Available: https://terrytao.wordpress.com/2016/05/11/notes-on-the-nash-embedding-theorem/. Accessed: Feb. 24, 2017.</p>storyhydrogen atom simulation2017-01-01T10:48:00-05:002017-01-01T10:48:00-05:00/jekyll/update/2017/01/01/hydrogen-atom-simulation<h3 id="story">story</h3>
<p>Being able to numerically simulate hydrogen orbitals has been a longstanding goal of mine.
I thought this would be a relatively simple task – translating the standard calculation into an eigenvector problem – but I’ve hit several numerical roadblocks along the way.
I’ve finally found a few workarounds to be able to get meaningful results.</p>
<p>Disclaimer: due to my general stubbornness, I have refused to read up on how these orbitals are usually computed – so there may very well be a better way to do this.</p>
<h3 id="d-potentials">1D potentials</h3>
<p>Let’s start with 1D potentials and the standard time-independent Schrödinger equation.</p>
<p>\[
\left(\hat{K} + \hat{V} \right) |\psi\rangle = E |\psi\rangle
\]</p>
<p>where \(\hat{K}\) and \(\hat{V}\) are the kinetic and potential energy operators respectively.</p>
<p>Typically one analytically solves for the eigenkets \(|\psi_n\rangle\) and eigenvalues \(E_n\) of the above equation.</p>
<p>For example, in the standard \(x\) basis, the equation becomes</p>
<p>\[
\left(\frac{-\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right) \psi(x) = E\psi(x)
\]</p>
<p>where \(V(x)\) is the potential energy with dependence only on \(x\).</p>
<p>Using this continuous basis, however, is not ideal since we are trying to solve the problem numerically.
Let’s instead write the equation in an approximate discrete basis where \(\psi\) becomes a vector and the operators become matrices.
Essentially we’re approximating \(\psi(x)\) by its samples \(\psi(x_k)\) for \(k\in\{1,2,\dots,n\}\).
In this basis, \(\hat{K}\) and \(\hat{V}\) become \(K\in S^n\) and \(V\in S^n\) respectively, where \(S^n\) is the set of symmetric \(n \times n\) matrices.</p>
<p>More concretely, \(K = \frac{-\hbar^2}{2m} D^n_2\) where \(D^n_2\) is the \(n \times n\) <a href="https://en.wikipedia.org/wiki/Finite_difference_coefficient">finite difference matrix</a> for the second derivative and
\(V = \mathrm{diag}(V(x_1),V(x_2),\dots,V(x_n))\) – that is, a diagonal matrix with the discretely sampled potential function along the diagonal.</p>
<p>This becomes the following matrix equation</p>
<p>\[
(K+V)\psi_n = E_n\psi_n
\]</p>
<p>where \(\psi_n\) and \(E_n\) are the eigenvectors and eigenvalues of \(K+V\) respectively.</p>
<p>I constructed \(K\) and \(V\) as described for a few different potentials.
For numerical stability (and since our matrices are symmetric), the SVD of the matrix \(K+V\) was computed in order to solve for eigenvectors.</p>
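<p>As a rough sketch of the construction described above (not the code used for the plots), here is a minimal NumPy version for the infinite well.
It assumes units where \(\hbar = m = 1\), a three-point central difference for \(D^n_2\), and a hypothetical helper name <code>solve_1d</code>.
It also swaps the SVD for <code>numpy.linalg.eigh</code>, which returns signed eigenvalues in ascending order for a symmetric matrix.
Truncating the difference matrix at the grid edges implicitly forces the wavefunction to zero just outside the grid, which is exactly the infinite-well boundary condition.</p>

```python
import numpy as np

def solve_1d(x, potential):
    """Eigenvalues and eigenvectors of K + V on a uniform grid x.

    Assumes hbar = m = 1 and a three-point central difference for the
    second derivative (an illustrative choice, not taken from the post).
    """
    n = x.size
    dx = x[1] - x[0]
    off = np.ones(n - 1)
    # Second-derivative matrix: (psi[k-1] - 2 psi[k] + psi[k+1]) / dx^2
    d2 = (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / dx**2
    kinetic = -0.5 * d2
    v = np.diag(potential(x))
    # eigh is meant for symmetric matrices; eigenvalues come back sorted
    return np.linalg.eigh(kinetic + v)

# Infinite well of width 1: V = 0 on the interior grid points only
x = np.linspace(0.0, 1.0, 502)[1:-1]
energies, states = solve_1d(x, lambda x: np.zeros_like(x))

# The exact levels in these units are k^2 pi^2 / 2 for k = 1, 2, ...
print(energies[:3])
print(np.pi**2 / 2 * np.array([1, 4, 9]))
```

<p>Any other potential that vanishes at the walls can be passed in the same way, for example <code>lambda x: 50.0 * (x - 0.5)**2</code> for an oscillator inside the well.</p>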
<p>Below is an example of the infinite potential well.</p>
<p><img src="/assets/images/hsym-inf_well.png" alt="1D infinite potential well" /></p>
<p>The black line indicates the potential, and the coloured lines are the first few eigenfunctions ordered by energy.
Note that the vertical spacing is arbitrary, but is used to indicate relative energy spacing.</p>
<p>But of course, this is overkill for the infinite potential well where the closed form solutions are already well known.
Let’s apply it to some weird potentials for fun!</p>
<p>Here’s a harmonic oscillator potential inside an infinite well.</p>
<p><img src="/assets/images/hsym-weird1.png" alt="weird potential 1" /></p>
<p>Here’s a sawtooth inside an infinite well.</p>
<p><img src="/assets/images/hsym-weird2.png" alt="weird potential 2" /></p>
<p>For now, I’m limited to wavefunctions that go to zero at the edge of the simulation domain due to some numerical issues.
This is why the potentials above are all within an infinite well.</p>
<h3 id="d-central-potentials">3D central potentials</h3>
<p>Now let’s look at the 3D time-independent Schrödinger equation with a central potential \(\hat{V}(r)\) in spherical coordinates.</p>
<p>\[
\left( \frac{-\hbar^2}{2\mu} \nabla^2 + V(r) \right) \psi = E\psi
\]</p>
<p>Fully writing out the Laplacian,</p>
<p>\[
\left[ \frac{-\hbar^2}{2\mu} \left( \frac{1}{r^2}\frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial\theta} \sin\theta \frac{\partial}{\partial\theta} + \frac{1}{r^2 \sin^2\theta } \frac{\partial^2}{\partial\phi^2} \right) + V(r) \right] \psi = E\psi
\]</p>
<p>Let’s separate \(\psi_{n,l,m}(r,\theta,\phi) = R_n(r) Y^m_l(\theta,\phi)\) where \(Y^m_l(\theta,\phi)\) are eigenfunctions of the angular part of the Laplacian.
That is, they satisfy</p>
<p>\[
\nabla^2Y^m_l(\theta,\phi) = -\frac{l(l+1)}{r^2}Y^m_l(\theta,\phi)
\]</p>
<p>Making the substitution and rearranging</p>
<p>\[
\left[ -\frac{\hbar^2}{2\mu}\left( \frac{1}{r^2} \frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} - \frac{l(l+1)}{r^2} \right) + V(r) \right] R_n = E_nR_n
\]</p>
<p>Making the additional substitution \(U_n(r) = rR_n(r)\),</p>
<p>\[
\left[ -\frac{\hbar^2}{2\mu} \left( \frac{\partial^2}{\partial r^2} - \frac{l(l+1)}{r^2} \right) + V(r) \right] U_n = E_nU_n
\]</p>
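<p>To spell out that substitution step: with \(R_n = U_n/r\), the radial derivative term simplifies as</p>
<p>\[
\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\frac{U_n}{r}\right) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r\frac{\partial U_n}{\partial r} - U_n\right) = \frac{1}{r}\frac{\partial^2 U_n}{\partial r^2}
\]</p>
<p>so every term in the radial equation carries a common factor of \(1/r\), and multiplying through by \(r\) gives the equation above.</p>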
<p>This is in the same form as the standard 1D equation with an effective potential of \(V_{eff}(r) = V(r) + \frac{\hbar^2l(l+1)}{2\mu r^2}\).
This additional term is known as the angular momentum barrier.</p>
<p>Now that we’ve reduced the equation to the 1D problem, we can use the same solver I used above on the following equation.</p>
<p>\[
\left[ -\frac{\hbar^2}{2\mu} \frac{\partial^2}{\partial r^2} + V_{eff}(r) \right] U_n = E_n U_n
\]</p>
<h4 id="spherical-harmonics">spherical harmonics</h4>
<p>One big thing that I’ve skipped over here is the existence of the \(Y^m_l(\theta,\phi)\) used above.
Luckily, these are known as the <a href="https://en.wikipedia.org/wiki/Spherical_harmonics">spherical harmonics</a> and are well known.</p>
<p>I decided to cheat here a bit.
Instead of numerically solving for the angular component of the equation, I’ve used the known closed-form spherical harmonics as a precomputed solution for it.</p>
<h4 id="radial-equation-with-coulomb-potential">radial equation with Coulomb potential</h4>
<p>With the angular part precomputed and taken care of, it should be relatively simple to solve for the radial component (it wasn’t).
Using the technique described above, this problem can be reduced to a simple 1D problem in \(U_n(r)\) with effective potential given by</p>
<p>\[
V_{eff}(r) = -\frac{e^2}{4\pi \epsilon_0} \frac{1}{r} + \frac{\hbar^2l(l+1)}{2\mu r^2}
\]</p>
<p>For simplicity, let’s nondimensionalize the equation.
By substituting \(r=a_0\rho=\frac{4\pi \epsilon_0 \hbar^2}{\mu e^2}\rho\) and \(E_n=\frac{\hbar^2}{2\mu a_0^2}\epsilon_n=\frac{\mu e^4}{32\pi^2\epsilon_0^2\hbar^2}\epsilon_n\) we have,</p>
<p>\[
\left[ -\frac{\partial^2}{\partial \rho^2} + \frac{l(l+1)}{\rho^2} - \frac{2}{\rho} \right]U_n = \epsilon_n U_n
\]</p>
<p>where \(a_0\) is the <a href="https://en.wikipedia.org/wiki/Bohr_radius">Bohr radius</a>.
This equation can be solved using the technique described above as long as it is placed within an infinite potential well (for numerical reasons).
The infinite barriers are placed far enough away that they do not affect the results.</p>
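<p>A minimal sketch of this step, under some assumptions of my own choosing: a uniform grid in \(\rho\) with spacing <code>h</code>, a three-point central difference for the second derivative, walls at \(\rho = 0\) and \(\rho = \rho_{max}\), and a hypothetical function name <code>radial_levels</code>.
In these units the textbook levels are \(\epsilon_n = -1/n^2\), so the lowest eigenvalue for a given \(l\) should come out close to \(-1/(l+1)^2\).</p>

```python
import numpy as np

def radial_levels(l, rho_max=50.0, h=0.05, count=3):
    """Lowest `count` eigenvalues of
    -U'' + [l(l+1)/rho^2 - 2/rho] U = eps U  with U(0) = U(rho_max) = 0.

    Grid spacing, box size and the three-point stencil are illustrative
    assumptions, not values taken from the post.
    """
    n = int(round(rho_max / h)) - 1       # number of interior grid points
    rho = h * np.arange(1, n + 1)         # excludes both walls
    off = np.full(n - 1, -1.0 / h**2)
    # -d^2/drho^2 as a symmetric tridiagonal matrix
    kinetic = np.diag(np.full(n, 2.0 / h**2)) + np.diag(off, 1) + np.diag(off, -1)
    v_eff = l * (l + 1) / rho**2 - 2.0 / rho
    eps = np.linalg.eigvalsh(kinetic + np.diag(v_eff))  # ascending order
    return eps[:count]

for l in range(3):
    print(l, radial_levels(l))
```

<p>Pushing <code>rho_max</code> further out or shrinking <code>h</code> is an easy way to check that the walls and the grid are not affecting the low-lying levels.</p>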
<p>The radial probability distributions (proportional to \(\rho^2 |R(\rho)|^2\)) for \(n=3\) are plotted below.</p>
<p><img src="/assets/images/hsym-radial.png" alt="hydrogen n=3 radial" /></p>
<p>Combining the radial solution with the spherical harmonics, we obtain the full solution.
Plotted below are the xz-plane cross sections of \(|\psi_{n,l,m}|\) for a selection of the \(n=3\) wavefunctions.</p>
<p><img src="/assets/images/hsym-orbitals.png" alt="hydrogen n=3 orbitals" /></p>
<p>This technique also allows us to look at the energies of the orbitals.
Plotted below are the computed relative energy levels for different \(n\)’s and \(l\)’s.</p>
<p><img src="/assets/images/hsym-energies.png" alt="hydrogen n<=3 energies" /></p>
<p>The energy is determined completely by \(n\) (as <a href="https://en.wikipedia.org/wiki/Hydrogen_atom#Bohr-Sommerfeld_Model">expected</a>).
This is more degeneracy than can be explained by simple rotational symmetry, which only results in a degeneracy of \(2l+1\).
The full \(n^2\) degeneracy is well known and unique to the Coulomb potential and comes from the ‘accidental’ conservation of an additional quantity known as the <a href="https://en.wikipedia.org/wiki/Laplace%E2%80%93Runge%E2%80%93Lenz_vector">Laplace-Runge-Lenz vector</a>.
This extra degeneracy does not appear in reality (this is shown in the standard orbital energy diagrams).
The splitting of this degeneracy comes from <a href="https://en.wikipedia.org/wiki/Fine_structure">fine structure corrections</a> that I will try to calculate in the future.</p>
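<p>For reference, the number of degenerate states at fixed \(n\) (ignoring spin) follows from summing the rotational degeneracy over the allowed values \(l = 0, 1, \dots, n-1\):</p>
<p>\[
\sum_{l=0}^{n-1} (2l+1) = 2\cdot\frac{n(n-1)}{2} + n = n^2
\]</p>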
<h3 id="future-work">future work</h3>
<p>I am currently limited to bound states that fall to zero near the simulation domain boundaries due to issues with the finite difference matrix near the edges.
As a workaround, I’ve been artificially imposing an infinite well on all of the bound potentials (see above).
This should not have a large effect on the simulation since the infinite walls are placed far from where the electron would be expected to be found, but it would be nice to have a more elegant solution to this problem.</p>
<h4 id="fine-structure-corrections">fine structure corrections</h4>
<p>One of the main motivations for doing this was to be able to numerically compute how different corrections to the Coulomb potential would split the degeneracies in the spectrum.
It is well known that the Coulomb potential has an extra <a href="http://iopscience.iop.org/article/10.1088/0305-4470/21/11/014">‘hidden’ \(O(4)\) symmetry</a> that results in extra degeneracies.
Adding corrections due to relativistic effects can break this extra symmetry and split the degeneracies.
This is pretty high on my list of things to do.</p>
<p>The ultimate goal would be a fully general stationary state solver given an arbitrary Hamiltonian, but this is probably overkill.</p>storywhat are spinors?2016-12-06T05:22:00-05:002016-12-06T05:22:00-05:00/jekyll/update/2016/12/06/what-are-spinors<h3 id="story">story</h3>
<p>I feel like this is a common question asked by students that have just reached the spin chapter of their undergraduate quantum mechanics text (this is me).
I’ve been using Shankar (2E) which has been great, but mysterious footnotes appear to be pointing to something more fundamental under the surface.
Quick internet searches have shown that what I’m looking for lies in Lie algebras (;.
I haven’t posted in a while and thought that a blog post would be a good way to organize my thoughts.</p>
<p>In this post, I’m going to try to describe my intuitive understanding.
I will then try to relate the intuition to the Lie algebra formalism.
Because of this, this post will lack rigor, but that can be found elsewhere.</p>
<h3 id="first-attempt">first attempt</h3>
<h4 id="rotations-in-mathbbr3">rotations in \(\mathbb{R}^3\)</h4>
<p>Firstly, let’s deal with the familiar vector space \(\mathbb{R}^3\).
Rotations in \(\mathbb{R}^3\) can be parameterized by three continuous variables and in general can be represented by a composition of three \(3\times 3\) matrices.</p>
<p>For simplicity, let’s deal with the typical <a href="https://en.wikipedia.org/wiki/Rotation_matrix">rotation matrices</a>.
For example, a matrix representing a rotation by angle \(\theta\) about the \(x\)-axis will be denoted \(R_x(\theta)\).</p>
<p>What makes these matrices rotation matrices?
Let’s look at the commutation relationship of these matrices.</p>
<p>Carrying out the matrix multiplication</p>
<p>\[
R_x(\theta)R_y(\phi)-R_y(\phi)R_x(\theta) =
\begin{bmatrix}
0 & -\sin\theta\sin\phi & \sin\phi-\cos\theta\sin\phi \\
\sin\theta\sin\phi & 0 & \sin\theta-\cos\phi\sin\theta \\
\sin\phi-\cos\theta\sin\phi & \sin\theta-\cos\phi\sin\theta & 0 \\
\end{bmatrix}
\]</p>
<p>Let’s approximate this to the second order, assuming infinitesimal rotations \(\theta=\epsilon_x\) and \(\phi=\epsilon_y\)</p>
<p>\[
R_x(\epsilon_x)R_y(\epsilon_y)-R_y(\epsilon_y)R_x(\epsilon_x) =
\begin{bmatrix}
0 & -\epsilon_x\epsilon_y & 0 \\
\epsilon_x\epsilon_y & 0 & 0 \\
0 & 0 & 0 \\
\end{bmatrix}
=
R_z(\epsilon_x\epsilon_y) - I
\]</p>
<p>Now, let’s also approximate the matrices on the left to the first order.
For example,</p>
<p>\[
R_x(\epsilon_x) = I + \epsilon_x
\begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & -1 \\
0 & 1 & 0 \\
\end{bmatrix}
\equiv I + \epsilon_xT_x
\]</p>
<p>where \(T_x\) is called the generator of infinitesimal rotations about the \(x\)-axis.</p>
<p>Any general rotation can be written as a product of infinitesimal ones.</p>
<p>Combining the above two equations, and repeating for permutations of \(x\), \(y\) and \(z\) results in the following commutation relations.</p>
<p>\[
\left[T_i,T_j\right] = \sum_k \varepsilon_{ijk}T_k
\]</p>
<p>where \(\varepsilon_{ijk}\) is the <a href="https://en.wikipedia.org/wiki/Levi-Civita_symbol">Levi-Civita symbol</a>.</p>
<p>The above commutation relationship captures the essence of rotations in \(\mathbb{R}^3\).</p>
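<p>These relations are easy to check numerically.
The sketch below builds the three generators from the Levi-Civita symbol (using \((T_k)_{ij} = -\varepsilon_{kij}\), which reproduces the \(T_x\) shown above), tests the commutation relations, and confirms that the commutator of two small finite rotations is \(\epsilon_x\epsilon_y T_z\) up to third-order terms.
The helper names <code>rot_x</code>, <code>rot_y</code> and <code>algebra_error</code> are just illustrative.</p>

```python
import numpy as np

# Levi-Civita symbol eps[i, j, k] with x, y, z -> 0, 1, 2
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

# Generators of rotations: (T_k)_{ij} = -eps_{kij}
T = -eps

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def algebra_error():
    """Largest entry of [T_i, T_j] - sum_k eps_ijk T_k over all i, j."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            comm = T[a] @ T[b] - T[b] @ T[a]
            expected = np.einsum("k,kab->ab", eps[a, b], T)
            worst = max(worst, np.abs(comm - expected).max())
    return worst

# Commutator of two small finite rotations versus eps_x * eps_y * T_z
ex, ey = 1e-3, 2e-3
finite_comm = rot_x(ex) @ rot_y(ey) - rot_y(ey) @ rot_x(ex)
second_order_residual = np.abs(finite_comm - ex * ey * T[2]).max()

print(algebra_error(), second_order_residual)
```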
<h4 id="representation-as-quantum-operators">representation as quantum operators</h4>
<p>Let’s try to find the quantum operators for spin.</p>
<p>The angular momentum operators, whether orbital or spin, are generators of rotations.
This statement should make some intuitive sense, and the proof can be easily found <a href="https://en.wikipedia.org/wiki/Angular_momentum_operator#Angular_momentum_as_the_generator_of_rotations">elsewhere</a>.</p>
<p>Because of this, what we have derived for the generators of rotations in \(\mathbb{R}^3\) should also hold for our spin angular momentum operators.</p>
<p>We will take out a factor of \(2i\) by convention.
This can easily be done since we can define new generators \(\sigma_k\) that absorb this constant, \(T_k = \frac{1}{2i}\sigma_k\).
We now have</p>
<p>\[
\left[\sigma_i,\sigma_j\right] = \sum_k 2i\varepsilon_{ijk}\sigma_k
\]</p>
<p>In addition to this, we require the generators to be Hermitian.
This condition (along with the factor of \(i\) in front) is necessary to ensure the unitarity of the resulting quantum rotation operator.</p>
<p>Let’s assume that we’ve found such a set of generators (a few more conditions are needed by convention to fully fix these generators).
Sprinkling in some dimension with carefully placed \(\hbar\)’s turns these dimensionless generators into the spin operators we’re seeking.</p>
<p>\[
S_k = \frac{\hbar}{2} \sigma_k
\]</p>
<h4 id="pauli-matrices">Pauli matrices</h4>
<p>There is one problem remaining that is preventing us from concretely expressing these generators.
What are the dimensions of our generator matrices?
To answer this, we will need to study a real world particle, and introduce some experimental facts.
Let’s stick with the electron.
The <a href="https://en.wikipedia.org/wiki/Stern%E2%80%93Gerlach_experiment">Stern-Gerlach</a> experiment indicates that an electron has two observable spin states, a condition known as spin-\(1/2\) (this is not important for this discussion, but should be clear if you’ve read up on the topic).</p>
<p>Because of this, we will use the following representation to describe the wavefunction of an electron</p>
<p>\[
|\Psi\rangle =
\begin{bmatrix}
|\Psi_+\rangle \\
|\Psi_-\rangle \\
\end{bmatrix}
\]</p>
<p>where \(|\Psi_+\rangle\) and \(|\Psi_-\rangle\) represent the component of the wavefunction with definite spin of \(s_z=+\hbar/2\) and \(s_z=-\hbar/2\) respectively (\(z\) is chosen out of convention).</p>
<p>Based on this representation, it is clear that</p>
<p>\[
S_z = \frac{\hbar}{2}
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
= \frac{\hbar}{2}\sigma_z
\]</p>
<p>Now all that’s left is to find \(S_x\) and \(S_y\) that satisfy the commutation relation.
The <a href="https://en.wikipedia.org/wiki/Pauli_matrices#Commutation_relations">Pauli matrices</a> \(\sigma_x\), \(\sigma_y\) and \(\sigma_z\) (up to a factor of \(\hbar/2\)) are just such a set of matrices!</p>
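<p>As a quick sanity check, the sketch below verifies numerically that the standard Pauli matrices are Hermitian and satisfy \(\left[\sigma_i,\sigma_j\right] = \sum_k 2i\varepsilon_{ijk}\sigma_k\), and that dividing them by \(2i\) gives matrices obeying the rotation-generator relation \(\left[T_i,T_j\right] = \sum_k \varepsilon_{ijk}T_k\).
The helper <code>max_violation</code> is a name I made up for this example.</p>

```python
import numpy as np

# Levi-Civita symbol eps[i, j, k] with x, y, z -> 0, 1, 2
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

# Pauli matrices sigma_x, sigma_y, sigma_z
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def commutator(a, b):
    return a @ b - b @ a

def max_violation(mats, factor):
    """Largest entry of [M_i, M_j] - factor * sum_k eps_ijk M_k."""
    return max(
        np.abs(
            commutator(mats[a], mats[b])
            - factor * np.einsum("k,kab->ab", eps[a, b], mats)
        ).max()
        for a in range(3)
        for b in range(3)
    )

pauli_error = max_violation(sigma, 2j)            # [s_i, s_j] = 2i eps s_k
rotation_error = max_violation(sigma / 2j, 1.0)   # [T_i, T_j] = eps T_k
hermitian = all(np.allclose(s, s.conj().T) for s in sigma)

print(pauli_error, rotation_error, hermitian)
```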
<h4 id="eigenfunctions-of-spin-12">eigenfunctions of spin-\(1/2\)</h4>
<p>Now what about measuring spin in an arbitrary direction?
Take the dot product of the spin operators with a unit vector \(\hat{n}=n_x\hat{i}+n_y\hat{j}+n_z\hat{k}\).</p>
<p>\[
S_{\hat{n}} = n_xS_x + n_yS_y + n_zS_z
\]</p>
<p>in the usual spherical coordinates, this results in</p>
<p>\[
S_{\hat{n}} = \frac{\hbar}{2}
\begin{bmatrix}
\cos\theta & \sin\theta e^{-i\phi} \\
\sin\theta e^{i\phi} & -\cos\theta \\
\end{bmatrix}
\]</p>
<p>the eigenfunctions of which are
\(
|\hat{n},+\rangle =
\begin{bmatrix}
\cos(\theta/2) e^{-i\phi/2} \\
\sin(\theta/2) e^{i\phi/2} \\
\end{bmatrix}
\)
and
\(
|\hat{n},-\rangle =
\begin{bmatrix}
-\sin(\theta/2) e^{-i\phi/2} \\
\cos(\theta/2) e^{i\phi/2} \\
\end{bmatrix}
\)</p>
<p>Now it can easily be seen that a rotation of \(2\pi\) introduces a minus sign to the eigenkets, and it takes a rotation of \(4\pi\) to return the kets to their original state.</p>
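<p>Here is a small numerical check of both claims, written in units of \(\hbar/2\) so that the operator is just \(\hat{n}\cdot\vec{\sigma}\) with eigenvalues \(\pm 1\).
The function names and the sample angles are arbitrary choices for illustration.</p>

```python
import numpy as np

def n_dot_sigma(theta, phi):
    """Spin operator along n(theta, phi), in units of hbar/2."""
    return np.array([
        [np.cos(theta), np.sin(theta) * np.exp(-1j * phi)],
        [np.sin(theta) * np.exp(1j * phi), -np.cos(theta)],
    ])

def ket_plus(theta, phi):
    return np.array([
        np.cos(theta / 2) * np.exp(-1j * phi / 2),
        np.sin(theta / 2) * np.exp(1j * phi / 2),
    ])

def ket_minus(theta, phi):
    return np.array([
        -np.sin(theta / 2) * np.exp(-1j * phi / 2),
        np.cos(theta / 2) * np.exp(1j * phi / 2),
    ])

theta, phi = 0.7, 1.3
op = n_dot_sigma(theta, phi)

# Eigenvalue equations: op|+> = +|+> and op|-> = -|->
plus_ok = np.allclose(op @ ket_plus(theta, phi), ket_plus(theta, phi))
minus_ok = np.allclose(op @ ket_minus(theta, phi), -ket_minus(theta, phi))

# Rotating phi by 2*pi flips the sign of the ket; 4*pi restores it
flips = np.allclose(ket_plus(theta, phi + 2 * np.pi), -ket_plus(theta, phi))
restores = np.allclose(ket_plus(theta, phi + 4 * np.pi), ket_plus(theta, phi))

print(plus_ok, minus_ok, flips, restores)
```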
<h4 id="higher-spin">higher spin</h4>
<p>The experimental fact that the spin angular momentum of electrons takes one of two values resulted in the above representation of the spin operators as \(2\times 2\) matrices.</p>
<p>The same commutation rules apply for deriving spin operators for higher spin particles.
The only difference is a different matrix representation is needed.</p>
<p>As an example, for spin-\(1\) particles,</p>
<p>\[
|\Psi\rangle =
\begin{bmatrix}
|\Psi_+\rangle \\
|\Psi_0\rangle \\
|\Psi_-\rangle \\
\end{bmatrix}
\]</p>
<p>The spin operators for spin-\(1\) particles are
\(
\frac{2}{\hbar} S_x = \sqrt{2}
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 1 \\
0 & 1 & 0 \\
\end{bmatrix}
\)
,
\(
\frac{2}{\hbar} S_y = \sqrt{2}
\begin{bmatrix}
0 & -i & 0 \\
i & 0 & -i \\
0 & i & 0 \\
\end{bmatrix}
\)
and
\(
\frac{2}{\hbar} S_z = 2
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & -1 \\
\end{bmatrix}
\).
It can be verified that these matrices satisfy the commutation relation.</p>
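<p>For anyone who would rather not do that multiplication by hand, here is a sketch of the check.
It uses the matrices exactly as written above, i.e. \(\frac{2}{\hbar}S_k\), which should obey the same relation as the Pauli matrices, and it also confirms that \(S^2 = s(s+1)\hbar^2\) with \(s=1\).
The variable names are my own.</p>

```python
import numpy as np

root2 = np.sqrt(2.0)

# (2 / hbar) * S_k for spin 1, copied from the matrices above
sx = root2 * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
sy = root2 * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
sz = 2.0 * np.diag([1.0, 0.0, -1.0]).astype(complex)

def commutator(a, b):
    return a @ b - b @ a

# Cyclic relations [a, b] = 2i c, the same form as for the Pauli matrices
cyclic = [(sx, sy, sz), (sy, sz, sx), (sz, sx, sy)]
worst = max(np.abs(commutator(a, b) - 2j * c).max() for a, b, c in cyclic)

# S^2 / hbar^2 should equal s(s+1) = 2 times the identity
s_squared = (sx @ sx + sy @ sy + sz @ sz) / 4.0

print(worst)
print(s_squared.real)
```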
<p>Notice that there are not enough constraints in the commutation relations to uniquely specify these spin operators.
Any set of \(3\times 3\) matrices related to these by a unitary change of basis satisfies the same commutation relations and produces the same physics – this is simply a common representation.</p>
<p>Using the same logic, we can derive an expression for the spinor eigenfunctions.
I will not because it is too much effort.</p>
<h3 id="lie-algebra-formalism">Lie algebra formalism</h3>
<p>Pronounced “lee” (disappointing, I know).</p>
<p>I will not try to derive the results above using this formalism.
This section is meant to informally relate the above ideas to Lie algebras and groups.</p>
<h4 id="lie-group">Lie group</h4>
<p>The notion of a Lie group is actually quite intuitive.
Loosely, it’s a group whose elements are parameterized by continuous and differentiable variable(s).</p>
<p>The set of orthogonal \(3\times 3\) matrices, that is, rotations and reflections in \(\mathbb{R}^3\) (it should be simple to verify that they satisfy the group axioms), forms the Lie group known as \(O(3)\).
The set of proper rotations is the subgroup \(SO(3)\).
One possible parameterization is the usual \(\theta_x\), \(\theta_y\) and \(\theta_z\), for example.</p>
<p>These groups are often used to describe symmetries, for example \(SO(3)\) can be seen as the symmetry group that preserves the Euclidean metric (together with orientation).</p>
<p>Similarly, the set of \(2\times2\) unitary matrices with determinant \(1\) is one realization of \(SU(2)\).
In fact, this is the form that \(SU(2)\) takes in the basis introduced for describing spin-\(1/2\) particles.
In this basis, \(SU(2)\) is generated by the Pauli matrices.
Note that this is just one realization of the group \(SU(2)\).
It takes a different form when working with higher spin particles as shown above.</p>
<h4 id="lie-algebra">Lie algebra</h4>
<p>A Lie algebra can be defined as the vector space formed by the infinitesimal generators of a Lie group.
In other words, a Lie algebra is the tangent space of a Lie group at the identity.
Because of the relation with a Lie group, it comes with a bilinear operation (the commutator) which makes the tangent vector space a Lie algebra.</p>
<p>The Lie algebras corresponding to \(SO(3)\) and \(SU(2)\) are \(\mathfrak{so}(3)\) and \(\mathfrak{su}(2)\) respectively.
It turns out that the Lie algebras \(\mathfrak{so}(3)\) and \(\mathfrak{su}(2)\) are isomorphic.</p>
<p>Just as we can define a Lie algebra from a Lie group, Lie’s third theorem states that we can go the other way – from Lie algebra to Lie group (with a few mathematical caveats).</p>
<p>Since the Lie algebras are isomorphic, the associated Lie groups \(SO(3)\) and \(SU(2)\) are related up to a <a href="https://en.wikipedia.org/wiki/Covering_group">covering</a>.
We have seen that \(SU(2)\) double covers \(SO(3)\) – that is, there is a 2-to-1 surjective mapping from \(SU(2)\) onto \(SO(3)\).</p>
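<p>The 2-to-1 mapping can be made concrete with a few lines of NumPy.
The sketch below uses one standard way of writing the map, \(R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U \sigma_j U^\dagger)\), together with \(U = \cos(\alpha/2) I - i\sin(\alpha/2)\,\hat{n}\cdot\vec{\sigma}\) for a rotation by \(\alpha\) about \(\hat{n}\).
Neither formula is derived in this post, and the function names are hypothetical.</p>

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def su2_rotation(axis, angle):
    """SU(2) element for a rotation by `angle` about `axis`."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    n_dot_sigma = np.einsum("k,kab->ab", n, SIGMA)
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * n_dot_sigma

def to_so3(u):
    """Map a 2x2 unitary to a 3x3 rotation: R_ij = Tr(s_i U s_j U^dagger) / 2."""
    u_dag = u.conj().T
    return np.array([
        [0.5 * np.trace(SIGMA[i] @ u @ SIGMA[j] @ u_dag).real for j in range(3)]
        for i in range(3)
    ])

axis, angle = [1.0, 2.0, 2.0], 0.9
u = su2_rotation(axis, angle)
r = to_so3(u)

# u and -u are distinct elements of SU(2) but give the same rotation
same_image = np.allclose(to_so3(-u), r)
# Rotating by an extra 2*pi returns the same R but flips the sign of U
sign_flip = np.allclose(su2_rotation(axis, angle + 2 * np.pi), -u)

print(same_image, sign_flip)
```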
<h3 id="summary">summary</h3>
<p>To summarize, first, we derived the commutation relations for the familiar infinitesimal generators of rotations in \(\mathbb{R}^3\).
By virtue of the fact that angular momentum is the generator of rotations, we argued that the spin operators must obey the same commutation relations.
The spin operators could then be derived, and the spin eigenfunctions found.
This showed explicitly the 2-to-1 mapping between the wavefunction spinor and the angular momentum vector.</p>
<p>From the perspective of Lie algebras, equating the commutation relations of the infinitesimal generators of rotations implied an isomorphic underlying Lie algebra.
This implies that the associated Lie groups are isomorphic up to a covering.
It turns out that \(SU(2)\) double covers \(SO(3)\) resulting in spinors which have a 2-to-1 mapping to angular momentum vectors.</p>storycounting number of binary trees with L leaves – extending the Catalan numbers2016-11-13T01:48:00-05:002016-11-13T01:48:00-05:00/jekyll/update/2016/11/13/counting-binary-trees<h3 id="story">story</h3>
<p>I initially became interested in this problem while talking with a friend.
I don’t remember how we got on this topic.</p>
<p>The number of trees with \(L\) leaves is ill-defined since, without imposing additional constraints, there are an infinite number of such trees.
Two potentially useful additional constraints come to mind: limiting the total number of nodes and limiting the height.
This post was rushed a bit because I wanted to test this new blogging platform.
As a result, this post will only focus on the first constraint and impose the additional constraint that the tree be binary.</p>
<h3 id="problem">problem</h3>
<p>How many unique binary trees are there with \(N\) nodes and \(L\) leaves?</p>
<h3 id="solution">solution</h3>
<p>The solution to this problem is most easily described by a recurrence relation.</p>
<p>\[
C_{N,L} = \sum_{0 \le k \le N-1} \sum_{0 \le b \le L} C_{k,b}C_{N-k-1, L-b} + \delta_{N,0}\delta_{L,0} + \delta_{N,1}\delta_{L,1}
\]</p>
<p>for \(N \gt L \gt 0\), \(N=L=0\), and \(N=L=1\); \(C_{N,L} = 0\) otherwise.</p>
<p>In the above equation, \(\delta\) is the Kronecker delta. The \(\delta_{N,0}\delta_{L,0}\) term represents the empty tree and the \(\delta_{N,1}\delta_{L,1}\) term represents the tree with one node.</p>
<p>I do not have a proof of correctness beyond what is implicit in the formulation above, but there are other ways to gain confidence in this answer.</p>
<p>Firstly, it is fortunate that the number of binary trees with \(N\) nodes (without the constraint on number of leaves) has a well known solution.
Namely, it is given by the <a href="https://oeis.org/A000108">Catalan numbers</a>, which are defined using a similar recurrence relationship.</p>
<p>The Catalan numbers \(C_N\) are given by the following formula:
\[
C_N = \frac{1}{N+1} \binom{2N}{N}
\]</p>
<p>It is obvious that the following relationship should hold,
\[
C_N = \sum_{0 \le b \le N} C_{N,b}
\]</p>
<p>This has been verified for up to \(N=100\).
There is probably a simple inductive proof that can be made here.</p>
<p>It is also easy to verify that it is correct for \(L=1\) producing the expected \(C_{N,1} = 2^{N-1}\).</p>
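<p>The recurrence and the two checks above can be sketched in a few lines of Python.
This is a straightforward memoized translation of the recurrence rather than the script used to generate the data for this post; <code>count_trees</code> and <code>catalan</code> are names I picked for the example.</p>

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_trees(n, leaves):
    """Number of binary trees with n nodes and the given number of leaves."""
    # Base cases: the empty tree and the single-node tree
    if (n, leaves) in ((0, 0), (1, 1)):
        return 1
    # Outside N > L > 0 the count is zero
    if not (n > leaves > 0):
        return 0
    # Root plus a left subtree (k nodes, b leaves) and a right subtree
    return sum(
        count_trees(k, b) * count_trees(n - k - 1, leaves - b)
        for k in range(n)
        for b in range(leaves + 1)
    )

def catalan(n):
    return comb(2 * n, n) // (n + 1)

for n in range(8):
    row = [count_trees(n, b) for b in range(n + 1)]
    print(n, row, sum(row), catalan(n))
```

<p>Each printed row lists \(C_{N,0}, \dots, C_{N,N}\) followed by the row sum and the corresponding Catalan number, so the two right-hand columns should agree.</p>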
<h3 id="discussion">discussion</h3>
<p>The Catalan numbers have many applications in combinatorics; this extension of the Catalan numbers may also be useful in some of those counting problems (namely, problems in which the number of leaves has a meaningful image under the mapping).</p>
<p>This recurrence relationship can be easily extended to a general n-ary tree using something similar to</p>
<p>\[
C_{N,L} = \sum_{p \in P} \sum_{b \in B} \prod_{i=1}^{I} C_{p_i,b_i} + \delta_{N,0}\delta_{L,0} + \delta_{N,1} \delta_{L,1}
\]</p>
<p>where \(P\) and \(B\) are the sets of <a href="https://en.wikipedia.org/wiki/Composition_(combinatorics)">weak compositions</a> of length \(I\) of \(N-1\) and \(L\) respectively, and \(I\) is the arity of the tree.</p>
<p>It would be interesting to develop an explicit formula for \(C_{N,L}\) perhaps by making use of generating functions.</p>
<p>I limited the scope of this problem since this post was rushed.
I’m interested in tackling a few of the questions mentioned above in the future.</p>
<h3 id="notes">notes</h3>
<ul>
<li>all terms up to \(N=100\) can be found <a href="/assets/data/counting-bintrees-100.txt">here</a></li>
</ul>story