# Tangent Vectors (Part I)

As discussed in our differential geometry course, a vector field, $X$, on a differentiable manifold, $M$, is a correspondence that associates to each point $p \in M$ a tangent vector $X(p) \in T_p M$.

This definition is taken verbatim from Do Carmo (page 25), and I think it's worth noting that the words “function,” “map,” and “mapping” aren’t used in this first, defining sentence.  These words were not used in our differential geometry class either, at least not when discussing vector fields or differential forms.  Terms like “function” require that we specify a domain and co-domain, and, in this instance, doing so requires yet more definitions.

To make sense of vector fields, we first need to understand tangent vectors, which, along with tangent spaces, will be the primary focus of this post.  The next post will focus on the tangent bundle and, finally, vector fields.  Although a tangent vector can be defined in many different ways, the most familiar definition is also the least sensible in our setting; this common, “easy” definition does not apply to an abstract manifold, but it nonetheless makes for an excellent starting point.

Failed-Def. (Tangent Vector)  A tangent vector at a point $p \in M$ is the velocity vector $\alpha'(0) \in T_pM$ of some curve $\alpha : I \to M$ with $\alpha(0) = p$.

This definition works perfectly fine when $M \subseteq \mathbb{R}^n$.  When $M$ is an abstract manifold, though, we do not know how to differentiate the outputs of a curve $\alpha : I \to M$.  Even if we use the charts and local $\mathbb{R}^n$ structure, different charts / coordinate patches will result in different output functions and, hence, different derivatives.

As discussed in our Differential Geometry class, tangent vectors are used to directionally-differentiate functions $f:U \subseteq \mathbb{R}^n \to \mathbb{R}$.  Indeed, they appear so frequently in this context that one is tempted to define them by this property, and that is precisely how we proceed in our current setting.  This would be a terrible definition to use in, say, a Vector Calculus course, where notions of multivariable differentiation are new, but it is nonetheless valid and useful.

Def. 1. (Tangent Vector)  A tangent vector at a point $p \in M$ is a function with domain $\mathcal{D} = \{\text{all functions differentiable near } p \in M\}$ and codomain $\mathbb{R}$; this function is obtained from a curve $\alpha : I \to M$ (with $\alpha(0) = p$) by mapping

$\displaystyle f \mapsto \frac{d}{dt} \, \left(f\circ\alpha\right) \Big{|}_{t=0}.$

This generalizes the use of a vector $\vec{v} \in \mathbb{R}^n$ in the definition of the directional derivative of $f:\mathbb{R}^n\to\mathbb{R}$.  Indeed, in Vector Calculus, the arbitrary curve $\alpha$ is replaced by the straight line $\ell(t) = p + t\vec{v}$.  Using the Chain Rule, one can show that if $\alpha$ and $\beta$ are any two curves with $\alpha(0) = p = \beta(0)$ and $\alpha'(0) = \vec{v} = \beta'(0)$ then

$\displaystyle \frac{d}{dt} \left(f\circ\alpha\right) \Big{|}_{t=0} = \nabla f(p )\cdot\vec{v}= \frac{d}{dt} \left(f\circ\beta\right)\Big{|}_{t=0}.$

The intermediate expression — $\nabla f(p )\cdot\vec{v}$ — is not available to us in our abstract setting.  After all, we are trying to define the concept of the tangent vector $\vec{v}$, and we have not (yet) developed a notion of the gradient of a function $f:M\to\mathbb{R}$.  That being said, this expression highlights how an old tangent vector $\vec{v} \in \mathbb{R}^n$ acts on a function to produce a real number: dot it with the gradient.  Rewriting this process as a derivative of a composition leads us to Definition 1.
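For readers who like to see numbers, here is a minimal sketch of the Chain Rule claim in the old $\mathbb{R}^2$ setting. Everything in it is a made-up illustration: the test function `f`, the point `p`, the vector `v`, and the two curves `alpha` and `beta` are my own choices, and the derivatives are only approximated by central differences.

```python
import math

def f(x, y):
    # Hypothetical test function on R^2 (not from the text).
    return math.sin(x) * y + x * x

def grad_f(x, y):
    # Gradient of f, computed by hand.
    return (math.cos(x) * y + 2 * x, math.sin(x))

p = (0.5, -1.0)   # base point
v = (2.0, 3.0)    # the "old" tangent vector at p

def alpha(t):
    # Straight line through p with velocity v.
    return (p[0] + t * v[0], p[1] + t * v[1])

def beta(t):
    # A curved path with beta(0) = p and beta'(0) = v.
    return (p[0] + t * v[0] + 5 * t * t, p[1] + v[1] * math.sin(t) - t ** 3)

def derivative_at_zero(g, h=1e-5):
    # Central-difference approximation of g'(0).
    return (g(h) - g(-h)) / (2 * h)

along_alpha = derivative_at_zero(lambda t: f(*alpha(t)))
along_beta = derivative_at_zero(lambda t: f(*beta(t)))
gx, gy = grad_f(*p)
gradient_dot_v = gx * v[0] + gy * v[1]

print(along_alpha, along_beta, gradient_dot_v)
```

Up to the finite-difference error, the three printed numbers should agree, which is the sense in which only $\alpha(0)$ and $\alpha'(0)$ matter and not the rest of the curve.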

For what it's worth, I think that the truly bizarre part of this is that we are no longer defining an object by what it “is” (an anchored arrow with length and direction, as in the old $\mathbb{R}^n$ case) but by what it “does”: it is a mapping that eats functions in a specified way.  Fortunately, we will have many opportunities to get used to this, for many of the geometric objects we will meet in this course can (and should) be introduced in such a way.

We denote a tangent vector $\vec{v}$ by the familiar notation $\vec{v}= \alpha'(0)$ when it arises via the curve $\alpha$, and it can be a bit strange to keep in mind that this now denotes a function that “eats a real-valued function $f$” and “spits out” a real number.  In particular, it now makes sense for us to write $\alpha'(0)\,(f) = 7$.

One should note that tangent vectors eat (locally) differentiable functions in a linear way and in accordance with the product / Leibniz rule of differentiation.  This follows from the differentiation of $f\circ\alpha$ and the standard facts for derivatives learned in Calculus.
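A quick numerical sanity check of these two properties, again in $\mathbb{R}^2$, might look like the sketch below. The curve `alpha` and the functions `f` and `g` are hypothetical choices of mine, and `tangent_vector` is just a finite-difference stand-in for $\alpha'(0)$ acting on a function.

```python
import math

def alpha(t):
    # Hypothetical curve in R^2 with alpha(0) = (1, 2).
    return (1.0 + math.sin(t), 2.0 + t - t * t)

def f(x, y):
    return x * y

def g(x, y):
    return math.exp(x) + y * y

def tangent_vector(func, h=1e-5):
    # alpha'(0) applied to func: d/dt of (func composed with alpha) at t = 0,
    # approximated by a central difference.
    return (func(*alpha(h)) - func(*alpha(-h))) / (2 * h)

p = alpha(0.0)

# Linearity: v(3f - 2g) versus 3 v(f) - 2 v(g).
lhs_linear = tangent_vector(lambda x, y: 3 * f(x, y) - 2 * g(x, y))
rhs_linear = 3 * tangent_vector(f) - 2 * tangent_vector(g)

# Leibniz rule: v(fg) versus f(p) v(g) + g(p) v(f).
lhs_leibniz = tangent_vector(lambda x, y: f(x, y) * g(x, y))
rhs_leibniz = f(*p) * tangent_vector(g) + g(*p) * tangent_vector(f)

print(lhs_linear, rhs_linear)
print(lhs_leibniz, rhs_leibniz)
```

Each printed pair should match up to the approximation error. Note that the Leibniz rule evaluates $f$ and $g$ at the base point $p$, which is why `p = alpha(0.0)` appears on the right-hand side.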

As explained by Do Carmo (pages 7-8), one can use local coordinates near $p \in M$ to rephrase exactly how a tangent vector $\alpha'(0)$ devours a real-valued function:

$\displaystyle \alpha'(0)\,(f) = \sum_{i=1}^n \! x_i'(0) \, \frac{\partial f}{\partial x_i}$

so that the tangent vector $\alpha'(0)$ acts like the (local) differential operator

$\displaystyle \alpha'(0) \longleftrightarrow \sum_{i=1}^n \! x_i'(0) \, \frac{\partial}{\partial x_i}.$

(The expressions $x_i(t)$ are the coordinate functions of the curve $\vec{x}^{-1}\circ\alpha$, which lives in $\mathbb{R}^n$.)  This observation leads to our second and third ways to define a tangent vector:

Def. 2. (Tangent Vector) A tangent vector at $p \in M$ is a function with domain $\mathcal{D}$ and co-domain $\mathbb{R}$ that, in local coordinates $\vec{x}:U \subseteq \mathbb{R}^n\to M$ near $p \in M$, has the form

$\displaystyle \sum_{i=1}^n \! a_i \frac{\partial}{\partial x_i}$

where the coefficients $a_i$ are real numbers.

Def. 3. (Tangent Vector) A tangent vector at $p \in M$ is a function with domain $\mathcal{D}$ and co-domain $\mathbb{R}$ that, in local coordinates $\vec{x}:U \subseteq \mathbb{R}^n\to M$ near $p$, can be expressed as a linear combination of the functions

$\displaystyle \left\{ \frac{\partial}{\partial x_1}, \frac{\partial }{\partial x_2}, \dots, \frac{\partial }{\partial x_n} \right\}$.

These alternate definitions prove extremely useful (as does almost any local-coordinate definition), but Do Carmo glosses over their equivalence to Definition 1.  Based on Section 2 of his book, it should be clear that every Def. 1-tangent vector, $\vec{v} = \alpha'(0)$, gives rise to a Def. 2-tangent vector.  However, the converse isn’t necessarily so clear or obvious.  That is, given an arbitrary expression

$\displaystyle \sum_{i=1}^n \! a_i \frac{\partial}{\partial x_i}$

how does one produce a curve $\alpha : I \to M$ with $\alpha(0) = p$ and so that

$\displaystyle \alpha'(0)(f) = \sum_{i=1}^n \! a_i \frac{\partial f}{\partial x_i}$?

In fact, there are infinitely many curves one could use for $\alpha$, but we need only determine one.  Fortunately, there is a standard way to do this, and, of course, it uses local coordinates.  Without loss of generality, suppose $0 \in \mathbb{R}^n$ is the pre-image of $p \in M$ under the coordinate chart $\vec{x}$ so that $\vec{x}(0) = p$.  We first construct a straight line in $\mathbb{R}^n$ that passes through $0$ and heads in the direction of the vector $(a_1, a_2, \dots, a_n)$; this is accomplished using the line

$\ell(t) = 0 + t\left(a_1, a_2, \dots, a_n\right).$

We can then set $\alpha(t) = \left(\vec{x}\circ\ell\right)(t)$.  It follows that $\alpha(0) = \left(\vec{x}\circ\ell\right)(0) = \vec{x}(0) = p$.  That $\alpha'(0)(f)$ satisfies the desired equation also follows from a straightforward computation:

$\displaystyle \alpha'(0)(f) = \frac{d}{dt} \left( f\circ \alpha \right) \Big{|}_{t=0} = \frac{d}{dt} \! f\left(x_1(t), x_2(t), \dots, x_n(t)\right) \Big{|}_{t=0} = \frac{d}{dt} f\left(ta_1, ta_2, \dots, ta_n\right)\Big{|}_{t=0}$

and this expression simplifies to

$\displaystyle \sum_{i=1}^n \! a_i \frac{\partial f}{\partial x_i}$

as desired.
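To see the construction in action, here is a small sketch under assumptions of my own: the manifold is a paraboloid in $\mathbb{R}^3$ with the hypothetical parameterization `chart`, the point is $p = \vec{x}(0,0)$, the function `f` is an arbitrary smooth function restricted to the surface, and the coefficients `a` are picked at random. Derivatives are approximated by central differences.

```python
import math

def chart(u1, u2):
    # Hypothetical parameterization of a paraboloid; chart(0, 0) = p.
    return (u1, u2, u1 * u1 + u2 * u2)

def f(x, y, z):
    # Hypothetical smooth function, restricted to the surface.
    return math.exp(x) * math.cos(y) + 3 * y + z

a = (1.5, -2.0)   # coefficients a_1, a_2 of the given operator

def alpha(t):
    # alpha = chart composed with the line l(t) = t * (a_1, a_2).
    return chart(t * a[0], t * a[1])

h = 1e-5

# Left side: alpha'(0)(f) = d/dt f(alpha(t)) at t = 0.
curve_side = (f(*alpha(h)) - f(*alpha(-h))) / (2 * h)

# Right side: sum of a_i * (partial of f in the coordinate x_i) at 0.
df_dx1 = (f(*chart(h, 0.0)) - f(*chart(-h, 0.0))) / (2 * h)
df_dx2 = (f(*chart(0.0, h)) - f(*chart(0.0, -h))) / (2 * h)
operator_side = a[0] * df_dx1 + a[1] * df_dx2

print(curve_side, operator_side)
```

For these particular choices the partials at the origin work out by hand to $1$ and $3$, so both sides should be close to $1.5 \cdot 1 - 2 \cdot 3 = -4.5$.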

Because the domain $\mathcal{D}$ is difficult to visualize or intuitively understand, it can be challenging to wrap one’s head around these ideas of tangent vectors.  When $M = \mathbb{R}^n$, a tangent vector at $p \in M$ can be understood as a “push” located at the point $p$, and when these “pushes” are applied to the inputs of a function, its outputs change at a particular rate.  Both the “force” (aka size) and direction of these “pushes” affect the change in our function, and, thankfully, the exact same interpretation is available to us in this more abstract setting: we think of each operator $\partial / \partial x_i$ as a “push” located at the point $p \in M$.

These “pushes” can move the inputs of a function $f:M\to\mathbb{R}$ through other inputs from $M$, and in local $\left(x_1, x_2, \dots, x_n\right)$ coordinates, the “standard pushes” $\partial / \partial x_i$ correspond to “pushes” parallel to the Euclidean directions at an arbitrary point $p \in M$.  This is depicted in Figure 3 on page 8 of Do Carmo (shown to the left in this post).  Of course, in the general setting, we cannot actually visualize $M$, both because its dimension may be too high and because it may be given as an abstract, not-embedded-in-ambient-3-space manifold.

These observations lead naturally to the definition of the tangent space at $p \in M$:

Def. (Tangent Space) The tangent space at $p \in M$ is the vector space of all tangent vectors at $p$; it is denoted by $T_pM$ and, if $\vec{x}$ is a local parameterization near $p$ with coordinates $(x_1, \dots, x_n)$, then $T_pM$ has as a basis the $n$ operators $\partial / \partial x_i$.

Usually one defines $T_pM$ to be the set of all tangent vectors at $p \in M$ and then proves that it has a vector space structure.  Of course, having explained how every Def. 1-tangent vector gives rise to a Def. 2- and Def. 3-tangent vector (and, more importantly, vice versa), we do not need to proceed along these lines.

Note that any local coordinates $x_i$ near $p \in M$ provide us with a basis for $T_pM$, and so if $y_i$ is another set of such coordinates, then we have two different bases for the tangent space $T_pM$:

$\displaystyle \text{span}\left\{\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \dots, \frac{\partial}{\partial x_n}\right\} = T_pM = \text{span}\left\{\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}, \dots, \frac{\partial}{\partial y_n}\right\} .$

Linear algebra tells us that the two bases differ by matrix multiplication, and it should come as no surprise that the derivative matrix of the transition function $\vec{y}^{-1}\circ\vec{x}$ (evaluated at the pre-image of $p \in M$) is the change-of-basis matrix.  Said slightly differently, to change coordinates we use transition functions, and to change tangent space bases we use the differential of transition functions.
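Here is a minimal sketch of this change of basis in a concrete case that I am choosing for illustration: $M$ is the right half-plane, $x_i$ are Cartesian coordinates (identity chart), and $y_i = (r, \theta)$ are polar coordinates. I am assuming the convention that coefficient columns transform as $b = Ja$, where $J$ is the derivative matrix of $\vec{y}^{-1}\circ\vec{x}$; the function `f`, the point `p`, and the coefficients `a` are hypothetical.

```python
import math

def polar_param(r, theta):
    # The parameterization y: (r, theta) -> point of the plane.
    return (r * math.cos(theta), r * math.sin(theta))

def transition(x1, x2):
    # y^{-1} composed with x, where x is the identity chart.
    return (math.hypot(x1, x2), math.atan2(x2, x1))

def f(x1, x2):
    # Hypothetical function on M, written in Cartesian coordinates.
    return x1 * x1 * x2 + math.sin(x2)

def partial(func, point, i, h=1e-6):
    # Central-difference partial derivative of func in slot i at point.
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def component(i):
    # i-th component function of the transition map.
    return lambda x1, x2: transition(x1, x2)[i]

p = (1.0, 1.0)        # Cartesian coordinates of the point
q = transition(*p)    # polar coordinates of the same point
a = (2.0, -1.0)       # v = 2 d/dx_1 - d/dx_2

# J[i][j] = d(y_i)/d(x_j), evaluated at the pre-image of p.
J = [[partial(component(i), p, j) for j in range(2)] for i in range(2)]
b = [sum(J[i][j] * a[j] for j in range(2)) for i in range(2)]

# The same tangent vector should eat f the same way in either basis.
v_f_cartesian = sum(a[j] * partial(f, p, j) for j in range(2))
f_polar = lambda r, theta: f(*polar_param(r, theta))
v_f_polar = sum(b[i] * partial(f_polar, q, i) for i in range(2))

print(b, v_f_cartesian, v_f_polar)
```

The check at the end is the important part: the coefficients change from `a` to `b`, but the real number produced by feeding `f` to the tangent vector should not.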

## Some Examples

Example 1.  Let us determine $T_pS^2$ with $p = (0, 0, 1)$ in a couple of ways.  First, we can do so entirely visually, using the “old school” intuition developed in Vector Calculus.  Visually speaking, the tangent space to the unit sphere at the north pole should be a horizontal plane with unit normal $(0, 0, \pm 1)$.  All of the tangent vectors in this plane are of the form $\left(a_1, a_2, 0\right)$.

We can also describe $T_pS^2$ using the local coordinates of example 4.6 (pg. 19).  In this example, the variables $x_i$ are used to describe the ambient space of $S^2$, but we will use them as our local coordinates and instead use $(x, y, z)$ for the ambient space coordinates.  Given that $S^2 = \{(x,y,z) : x^2+y^2+z^2 = 1\}$ we have coordinates

$\displaystyle x_1 = \frac{x}{1+z} \text{ and } x_2 = \frac{y}{1+z}$

defined everywhere except the south pole $(0, 0, -1)$.  These local coordinates allow us to write

$\displaystyle T_pS^2 = \text{span} \left\{\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}\right\} \cong \mathbb{R}^2.$

For instance, the tangent vector $\vec{v} = \partial/\partial x_1 - 2\partial/\partial x_2$ should correspond to an “actual arrow” anchored at $p = (0, 0, 1)$ along which we could differentiate an arbitrary function $f:S^2\to\mathbb{R}$.   I claim that this tangent vector corresponds to the vector $(2,-4,0)$.  Is this correct?
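One way to test the claim numerically is to build the curve $\alpha = \vec{x}\circ\ell$ with $\ell(t) = t(1, -2)$ and approximate its velocity in the ambient $\mathbb{R}^3$. The sketch below assumes that the parameterization inverting the coordinates above is $\vec{x}(x_1, x_2) = \left(\frac{2x_1}{1+s}, \frac{2x_2}{1+s}, \frac{1-s}{1+s}\right)$ with $s = x_1^2 + x_2^2$. I derived this formula myself, so it is worth double-checking; the function names are mine as well.

```python
def param(u1, u2):
    # Assumed inverse of the coordinates x_1 = x/(1+z), x_2 = y/(1+z).
    s = u1 * u1 + u2 * u2
    return (2 * u1 / (1 + s), 2 * u2 / (1 + s), (1 - s) / (1 + s))

def coords(x, y, z):
    # The local coordinates from the text (undefined at the south pole).
    return (x / (1 + z), y / (1 + z))

a = (1.0, -2.0)   # v = d/dx_1 - 2 d/dx_2

def alpha(t):
    # alpha = param composed with the line l(t) = t * a; alpha(0) = (0, 0, 1).
    return param(t * a[0], t * a[1])

h = 1e-5
forward, backward = alpha(h), alpha(-h)

# Central-difference approximation of the ambient velocity alpha'(0).
velocity = tuple((fw - bw) / (2 * h) for fw, bw in zip(forward, backward))

# Consistency check: coords should undo param at a sample point.
round_trip = coords(*param(0.3, -0.7))

print(alpha(0.0), velocity, round_trip)
```

Under these assumptions, the printed velocity can be compared directly against the claimed arrow $(2, -4, 0)$; its third component should be essentially zero, matching the horizontal plane found visually.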