WikiJournal Preprints/Cut the coordinates! (or Vector Analysis Done Fast)

WikiJournal Preprints
Open access • Publication charge free • Public peer review

Article information

Author: Gavin R. Putland[i] 

Abstract

The gradient, the curl, the divergence, and the Laplacian are initially defined, without coordinates, as closed-surface integrals per unit volume—the definition of the Laplacian being indifferent to whether the operand is a scalar field or a vector field. Four integral theorems—including the divergence theorem—follow almost immediately, provided that the initial definitions are unambiguous. Their unambiguity, together with their usefulness, is established as follows, at a level suitable for beginners (although this abstract is for prospective instructors):
  • The gradient is related to an acceleration through an equation of motion;
  • The divergence is related to two time-derivatives of density (the partial derivative and the material derivative) through two forms of an equation of continuity;
  • The component of the curl in a general direction is expressed as a divergence (now known to be unambiguous);
  • The same is done for the general component of the gradient, yielding not only a second proof of unambiguity of the gradient, but also the relation between the gradient and the directional derivative; this together with the original definition of the Laplacian shows that the Laplacian of a scalar field is the divergence of the gradient and therefore unambiguous. The unambiguity of the Laplacian of a vector field then follows from a component argument (as for the curl) or from a linearity argument.

The derivation of the relation between the gradient and the directional derivative yields a coordinate-free definition of the dot-del operator for a scalar right-hand operand. But, as the directional derivative is also defined for a non-scalar operand, the same relation offers a method of generalizing the dot-del operator, so that the definition of the Laplacian of a general field can be rewritten with that operator. The advection operator—derived without coordinates, for both scalar and vector properties—is likewise rewritten.

Meanwhile comparison between the definitions of the various operators leads to coordinate-free definitions of the del-cross, del-dot, and del-squared operators. These together with the dot-del operator allow the four integral theorems to be condensed into a single generalized volume-integral theorem.

If the volume of integration is reduced to a thin curved slab of uniform thickness, with an edge-face perpendicular to the broad faces, the four integral theorems are reduced to their two-dimensional forms, each of which relates an integral over a surface segment to an integral around its enclosing curve, provided that the original closed-surface integral has no contribution from the broad faces of the slab. This proviso can be satisfied by construction in two of the four cases, yielding two general theorems, one of which is the Kelvin–Stokes theorem. By applying these two theorems to a segment of a closed surface, and expanding the segment to cover the entire surface, it is shown that the gradient is irrotational and the curl is solenoidal.

The next part of the exposition is more conventional, but still coordinate-free. The gradient theorem is derived from the relation between the gradient and the directional derivative. An irrotational field is shown to have a scalar potential. The 1/r scalar field is shown to be the field whose negative gradient is the inverse-square vector field, whose divergence is a delta function, which is therefore also the negative Laplacian of the 1/r scalar field. These results enable the construction of a field with a given divergence or a given Laplacian. The wave equation is derived from small-amplitude sound waves in a non-viscous fluid, and shown to be satisfied by a spherical-wave field with a 1/r amplitude, whose D'Alembertian is a delta function, enabling the construction of a wave function with a given D'Alembertian. But further progress, including the construction of a field with a given curl, seems to require the introduction of coordinates.

With the aid of identities already found, expressions are easily obtained for the gradient, curl, divergence, Laplacian, and advection operators in Cartesian coordinates—with indicial notation and implicit summation, for brevity. While the resulting expressions for the curl and divergence may look unfamiliar, they match the initial definitions given by J. Willard Gibbs and are convenient for deriving further identities. A collection of identities is derived.

[To be continued.]


Introduction

Sheldon Axler, in his essay "Down with determinants!" (1995) and his ensuing book Linear Algebra Done Right (4th Ed., 2023–), does not entirely eliminate determinants, but introduces them as late as possible and then exploits them for what he calls their "main reasonable use in undergraduate mathematics", namely the change-of-variables formula for multiple integrals.[1] Here I treat coordinates in vector analysis somewhat as Axler treats determinants in linear algebra: I introduce coordinates as late as possible, and then exploit them in unconventionally rigorous derivations of vector-analytic identities from vector-algebraic identities. But I contrast with Axler in at least two ways. First, as my subtitle suggests, I have no intention of expanding my paper into a book. Brevity is of the essence. Second, while one may well avoid determinants in numerical linear algebra,[2] one can hardly avoid coordinates in numerical vector analysis! So I cannot extend the coordinate-minimizing path into computation. But I can extend it up to the threshold by expressing the operators of vector analysis in a suitably general coordinate system, leaving others to specialize it and compute with it. On the way, I can satisfy readers who need the concepts of vector analysis for theoretical purposes, and who would rather read a paper than a book.

The cost of coordinates

Mathematicians define a "vector" as a member of a vector space, which is a set whose members satisfy certain basic rules of algebra (called the vector-space axioms) with respect to another set (called a field), which has its own basic rules of algebra (the field axioms), and whose members are called "scalars". Physicists are more fussy. They typically want a "vector" to be not only a member of a vector space, but also a first-order tensor : a "tensor", meaning that it has an existence independent of any coordinate system with which it might be specified; and "first-order" (or "first-degree", or "first-rank"), meaning that it is specified by a one-dimensional array of numbers. Similarly, a 2nd-order tensor is specified by a 2-dimensional array (a matrix), and a 3rd-order by a 3-dimensional array, and so on; and a "scalar", being specified by a single number (a zero-dimensional array), is a zero-order tensor. In "vector analysis", we are greatly interested in applications to physical situations, and accordingly take the physicists' view on what constitutes a vector or a scalar.

So, for our purposes, defining a quantity by three components in (say) a Cartesian coordinate system is not enough to make it a vector, and defining a quantity as a real function of a list of coordinates is not enough to make it a scalar, because we still need to show that the quantity has an independent existence. One way to do this is to show that its coordinate representation behaves appropriately when the coordinate system is changed. (But don't worry if the following details look cryptic, because we won't be using them!) Independent existence of a quantity means that its coordinate representation is contravariant—that is, the representation changes so as to compensate for the change in the coordinate system.[a] But independent existence of an operator means that its coordinate representation is covariant—that is, the representation of the operator in the coordinate system, with the operand(s) and the result in that system, has the same form in one coordinate system as in another (except for features internal to the system).[b]

Here we circumvent these complications by the most obvious route: by initially defining things without coordinates. If, having defined something without coordinates, we then need to represent it with coordinates, we can choose the coordinate system for convenience.

The limitations of limits

In the branch of pure mathematics known as analysis, there is a thing called a limit, whereby for every positive ϵ  there exists a positive δ such that if some increment is less than δ, some error is less than ϵ. In the branch of applied mathematics known as continuum mechanics, there is a thing called reality, whereby if the increment is less than some positive δ, the assumption of a continuum becomes ridiculous, so that the error cannot be made less than an arbitrary ϵ. Yet vector "analysis" (together with higher-order tensors) is typically studied with the intention of applying it to some form of "continuum" mechanics, such as the modeling of elasticity, plasticity, fluid flow, or (widening the net) electrodynamics of ordinary matter; in short, it is studied with the intention of conveniently forgetting that, on a sufficiently small scale, matter is lumpy.[c] One might therefore submit that to express the principles of vector analysis in the language of limits is to strain at a gnat and swallow a camel. Here I avoid that camel by referring to elements of length or area or volume, each of which is small enough to allow some quantity or quantities to be considered uniform within it, but, for the same reason, large enough to allow such local averaging of the said quantity or quantities as is necessary to tune out the lumpiness.

We shall see bigger camels, where well-known authors define or misdefine a vector operator and then want to treat it like an ordinary vector (a quantity). These I also avoid.

Prerequisites

I assume that the reader is familiar with the algebra and geometry of vectors in 3D space, including the dot-product, the cross-product, and the scalar triple product, their geometric meanings, their expressions in Cartesian coordinates, and the identity

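$$\mathbf a\times(\mathbf b\times\mathbf c) \;=\; \mathbf a\cdot\mathbf c\;\,\mathbf b \;-\; \mathbf a\cdot\mathbf b\;\,\mathbf c\,,$$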

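which we call the "expansion" of the vector triple product.[3] I further assume that the reader can generalize the concept of a derivative, so as to differentiate a vector with respect to a scalar, e.g. to obtain the velocity $\mathbf v = d\mathbf r/dt$ from a time-dependent position vector $\mathbf r$,

or so as to differentiate a function of several independent variables "partially" w.r.t. one of them while the others are held constant, e.g. to obtain $\partial f/\partial x$ from $f(x, y, z)$ with y and z held constant.
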
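But in view of the above remarks on limits, I also expect the reader to be tolerant of an argument like this: In a short time dt, let the vectors r and p change by dr and dp respectively. Then, neglecting the second-order term dr × dp,

$$d(\mathbf r\times\mathbf p) = (\mathbf r+d\mathbf r)\times(\mathbf p+d\mathbf p) - \mathbf r\times\mathbf p = \mathbf r\times d\mathbf p + d\mathbf r\times\mathbf p\,, \qquad\text{so that}\qquad \frac{d}{dt}(\mathbf r\times\mathbf p) = \mathbf r\times\frac{d\mathbf p}{dt} + \frac{d\mathbf r}{dt}\times\mathbf p\,,$$

where, as always, the orders of the cross-products matter.[d] Differentiation of a dot-product behaves similarly, except that the orders don't matter; and if  p = mv, where m is a scalar and v is a vector, then

$$\frac{d\mathbf p}{dt} = m\,\frac{d\mathbf v}{dt} + \frac{dm}{dt}\,\mathbf v\,.$$
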
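Or an argument like this:  If f is a function of x and y, then

$$\frac{\partial}{\partial x}\frac{\partial f}{\partial y} = \frac{\big[f(x{+}dx,\,y{+}dy) - f(x{+}dx,\,y)\big] - \big[f(x,\,y{+}dy) - f(x,\,y)\big]}{dx\,dy} = \frac{\partial}{\partial y}\frac{\partial f}{\partial x}\,;$$

that is, we can switch the order of differentiation in a "mixed" partial derivative. If $\partial_x$ is an abbreviation for $\partial/\partial x$, etc., this rule can be written in operational terms as

$$\partial_x\,\partial_y = \partial_y\,\partial_x\,.$$

More generally, if $\partial_i$ is an abbreviation for $\partial/\partial x_i$ where i = 1, 2,…, the rule becomes

$$\partial_i\,\partial_j = \partial_j\,\partial_i\,.$$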

These generalizations of differentiation, however, do not go beyond differentiation w.r.t. real variables, some of which are scalars, and some of which are coordinates. Vector analysis involves quantities that may be loosely described as derivatives w.r.t. a vector—usually the position vector.
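
The two vector-algebraic prerequisites that are used most often later, namely the expansion of the vector triple product and the interchange of the dot and the cross in the scalar triple product, can be spot-checked in a few lines. The following is only a sketch in plain Python; the helper names `dot` and `cross` and the integer test vectors are arbitrary choices made for illustration.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Arbitrary integer vectors, so that the comparisons below are exact.
a, b, c = (1, 2, 3), (-2, 1, 4), (3, -1, 2)

# Expansion of the vector triple product: a x (b x c) = (a.c) b - (a.b) c
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c) * bk - dot(a, b) * ck for bk, ck in zip(b, c))
print(lhs, rhs)

# Scalar triple product: the dot and the cross can be interchanged.
print(dot(cross(a, b), c), dot(a, cross(b, c)))
```

A check with three particular vectors proves nothing in general, of course; it merely guards against sign errors when the identities are applied by hand.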

Closed-surface integrals per unit volume

The term field, mentioned above in the context of algebraic axioms, has an alternative meaning: if r is the position vector, a scalar field is a scalar-valued function of r, and a vector field is a vector-valued function of r; both may also depend on time. These are the functions of which we want "derivatives" w.r.t. the vector r.

In this section I introduce four such derivatives—the gradient, the curl, the divergence, and the Laplacian —in a way that will seem unremarkable to those readers who aren't already familiar with them, but idiosyncratic to those who are. The gradient is commonly introduced in connection with a curve and its endpoints, the curl in connection with a surface segment and its enclosing curve, the divergence in connection with a volume and its enclosing surface, and the Laplacian as a composite of two of the above, initially applicable only to a scalar field. Here I introduce all four in connection with a volume and its enclosing surface; and I introduce the Laplacian as a concept in its own right, equally applicable to a scalar or vector field, and only later relate it to the others. My initial definitions of the gradient, the curl, and the Laplacian, although not novel, are usually thought to be more advanced than the common ones—in spite of being conceptually simpler, and in spite of being obvious variations on the same theme.

Instant integral theorems (with a caveat)

Let V be a volume (3D region) enclosed by a surface S (a mathematical surface, not generally a physical barrier). Let n̂ be the unit normal vector at a general point on S, pointing out of V. Let n be the distance from S in the direction of n̂ (positive outside V, negative inside), and let $\partial_n$ be an abbreviation for $\partial/\partial n$, where the derivative (commonly called the normal derivative) is tacitly assumed to exist.

In V, and on S, let p be a scalar field (e.g., pressure in a fluid, or temperature), and let q be a vector field (e.g., flow velocity, or heat-flow density), and let ψ be a generic field which may be a scalar or a vector. Let a general element of the surface S have area dS, and let it be small enough to allow n̂, p, q, and $\partial_n\psi$ to be considered uniform over the element. Then, for every element, the following four products are well defined:

$$\hat{\mathbf n}\,p\,dS\,,\qquad \hat{\mathbf n}\times\mathbf q\,dS\,,\qquad \hat{\mathbf n}\cdot\mathbf q\,dS\,,\qquad \partial_n\psi\,dS\,. \tag{1}$$

If p is pressure in a non-viscous fluid, the first of these products is the force exerted by the fluid in V  through the area dS. The second product does not have such an obvious physical interpretation; but if q is circulating clockwise about an axis directed through V, the cross-product will be exactly tangential to S and will tend to have a component in the direction of that axis. The third product is the flux of q through the surface element; if q is flow velocity, the third product is the volumetric flow rate (volume per unit time) out of V  through dS ; or if q is heat-flow density, the third product is the heat transfer rate (energy per unit time) out of V  through dS. The fourth product, by analogy with the third, might be called the flux of the normal derivative of ψ through the surface element, but is equally well defined whether ψ is a scalar or a vector—or, for that matter, a matrix, or a tensor of any order, or anything else that we can differentiate w.r.t. n.

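If we add up each of the four products over all the elements of the surface S, we obtain, respectively, the four surface integrals

$$\iint_S \hat{\mathbf n}\,p\,dS\,,\qquad \iint_S \hat{\mathbf n}\times\mathbf q\,dS\,,\qquad \iint_S \hat{\mathbf n}\cdot\mathbf q\,dS\,,\qquad \iint_S \partial_n\psi\,dS\,, \tag{2}$$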

in which the double integral sign indicates that the range of integration is two-dimensional. The first surface integral takes a scalar field and yields a vector; the second takes a vector field and yields a vector; the third takes a vector field and yields a scalar; and the fourth takes (e.g.) a scalar field yielding a scalar, or a vector field yielding a vector. If p is pressure in a non-viscous fluid, the first integral is the force exerted by the fluid in V  on the fluid outside V. The second integral may be called the skew surface integral of q over S ,[4] or, for the reason hinted above, the circulation of q over S.  The third integral, commonly called the flux integral (or simply the surface integral) of q over S, is the total flux of q out of V. And the fourth integral is the surface integral of the outward normal derivative of ψ.

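Let the volume V  be divided into elements. Let a general volume element have the volume dV and be enclosed by the surface δS (not to be confused with the area dS of a surface element, which may be an element of S or of δS). Now consider what happens if, instead of evaluating each of the above surface integrals over S, we evaluate it over each δS and add up the results for all the volume elements. In the interior of V, each surface element of area dS is on the boundary between two volume elements, for which the unit normals n̂ at dS, and the respective values of $\partial_n\psi$, are equal and opposite. Hence when we add up the integrals over the surfaces δS, the contributions from the elements dS cancel in pairs, except on the original surface S, so that we are left with the original integral over S. So, for the four surface integrals in (2), we have respectively

$$\begin{aligned}
\iint_S \hat{\mathbf n}\,p\,dS &= \sum_{dV}\iint_{\delta S}\hat{\mathbf n}\,p\,dS = \sum_{dV}\left(\frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\,p\,dS\right)dV\,,\\
\iint_S \hat{\mathbf n}\times\mathbf q\,dS &= \sum_{dV}\iint_{\delta S}\hat{\mathbf n}\times\mathbf q\,dS = \sum_{dV}\left(\frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\times\mathbf q\,dS\right)dV\,,\\
\iint_S \hat{\mathbf n}\cdot\mathbf q\,dS &= \sum_{dV}\iint_{\delta S}\hat{\mathbf n}\cdot\mathbf q\,dS = \sum_{dV}\left(\frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\cdot\mathbf q\,dS\right)dV\,,\\
\iint_S \partial_n\psi\,dS &= \sum_{dV}\iint_{\delta S}\partial_n\psi\,dS = \sum_{dV}\left(\frac{1}{dV}\iint_{\delta S}\partial_n\psi\,dS\right)dV\,.
\end{aligned} \tag{3}$$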
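
The pairwise cancellation behind equations (3) can be watched at work numerically. The sketch below assumes, purely for illustration, that V is a unit cube divided into eight half-size cubes, that the integrand is the flux n̂⸱q of an arbitrarily chosen field, and that every face is sampled by a midpoint rule on matching grids; the names `box_patches`, `flux`, and `q` are hypothetical helpers, not part of the exposition.

```python
import math

def box_patches(center, half, n):
    """Yield (point, outward unit normal, area) for n*n midpoint patches
    on each of the six faces of an axis-aligned box."""
    for axis in range(3):
        u, w = [k for k in range(3) if k != axis]
        du, dw = 2.0 * half[u] / n, 2.0 * half[w] / n
        for sign in (-1.0, 1.0):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            for i in range(n):
                for j in range(n):
                    point = list(center)
                    point[axis] += sign * half[axis]
                    point[u] += (i + 0.5) * du - half[u]
                    point[w] += (j + 0.5) * dw - half[w]
                    yield point, normal, du * dw

def flux(q, center, half, n):
    """Midpoint-rule estimate of the closed-surface integral of n.q over the box."""
    total = 0.0
    for point, normal, dS in box_patches(center, half, n):
        total += sum(nk * qk for nk, qk in zip(normal, q(point))) * dS
    return total

def q(r):
    """An arbitrary smooth vector field (illustrative choice)."""
    x, y, z = r
    return (x * x + math.sin(y), x * z, y * math.exp(-z))

# Integral over the surface S of the whole unit cube.
whole = flux(q, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), 8)

# Sum of the integrals over the surfaces of the eight sub-cubes.
parts = 0.0
for cx in (0.25, 0.75):
    for cy in (0.25, 0.75):
        for cz in (0.25, 0.75):
            parts += flux(q, (cx, cy, cz), (0.25, 0.25, 0.25), 4)

print(whole, parts)
```

The two printed numbers agree to within rounding error, because every interior patch is counted twice with opposite normals, whatever the field q may be.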

Now comes a big "if":  if  we define the gradient of p (pronounced "grad p") as

$$\nabla p = \frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\,p\,dS \tag{4g}$$

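and the curl of q as

$$\operatorname{curl}\mathbf q = \frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\times\mathbf q\,dS \tag{4c}$$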

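and the divergence of q as

$$\operatorname{div}\mathbf q = \frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\cdot\mathbf q\,dS \tag{4d}$$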

and the Laplacian of ψ as [e]

$$\triangle\psi = \frac{1}{dV}\iint_{\delta S}\partial_n\psi\,dS \tag{4L}$$

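(where the letters after the equation number stand for gradient, curl, divergence, and Laplacian, respectively), then equations (3) can be rewritten

$$\iint_S \hat{\mathbf n}\,p\,dS = \sum \nabla p\;dV\,,\qquad \iint_S \hat{\mathbf n}\times\mathbf q\,dS = \sum \operatorname{curl}\mathbf q\;dV\,,\qquad \iint_S \hat{\mathbf n}\cdot\mathbf q\,dS = \sum \operatorname{div}\mathbf q\;dV\,,\qquad \iint_S \partial_n\psi\,dS = \sum \triangle\psi\;dV\,.$$
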
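But because each term in each sum has a factor dV, we call the sum an integral; and because the range of integration is three-dimensional, we use a triple integral sign. Thus we obtain the following four theorems relating integrals over an enclosing surface S  to integrals over the enclosed volume V :

$$\iint_S \hat{\mathbf n}\,p\,dS = \iiint_V \nabla p\;dV \tag{5g}$$

$$\iint_S \hat{\mathbf n}\times\mathbf q\,dS = \iiint_V \operatorname{curl}\mathbf q\;dV \tag{5c}$$

$$\iint_S \hat{\mathbf n}\cdot\mathbf q\,dS = \iiint_V \operatorname{div}\mathbf q\;dV \tag{5d}$$

$$\iint_S \partial_n\psi\,dS = \iiint_V \triangle\psi\;dV \tag{5L}$$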

Of the above four results, only the third (5d) seems to have a standard name; it is called the divergence theorem (or Gauss's theorem or, more properly, Ostrogradsky's theorem[5]), and is indeed the best known of the four—although the other three, having been derived in parallel with it, may be said to be equally fundamental.

As each of the operators ∇, curl, and div calls for an integration w.r.t. area and then a division by volume, the dimension (or unit of measurement) of the result is the dimension of the operand divided by the dimension of length, as if the operation were some sort of differentiation w.r.t. position. Moreover, in each of equations (5g) to (5d), there is a triple integral on the right but only a double integral on the left, so that each of the operators ∇, curl, and div appears to compensate for a single integration. For these reasons, and for convenience, we shall describe them as differential operators. By comparison, the △ operator in (4L) or (5L) calls for a further differentiation w.r.t. n ; we shall therefore describe △ as a 2nd-order differential operator. (Another reason for these descriptions will emerge in due course.) As promised, the four definitions (4g) to (4L) are "obvious variations on the same theme" (although the fourth is somewhat less obvious than the others).
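
Definitions (4g) to (4L) can be tried out directly on a small volume element. The following sketch assumes that δS is a small cube, that the surface integrals are estimated by a midpoint rule, and that the normal derivative in (4L) is estimated by a central difference for a scalar operand only. The names `box_patches`, `grad`, `curl`, `div`, `laplacian`, and the sample fields `p` and `q`, are illustrative. The values quoted in the comments come from the familiar Cartesian formulas and serve only as an independent check.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def box_patches(center, half, n=4):
    """Yield (point, outward unit normal, area) for midpoint patches on a box."""
    for axis in range(3):
        u, w = [k for k in range(3) if k != axis]
        du, dw = 2.0 * half[u] / n, 2.0 * half[w] / n
        for sign in (-1.0, 1.0):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            for i in range(n):
                for j in range(n):
                    point = list(center)
                    point[axis] += sign * half[axis]
                    point[u] += (i + 0.5) * du - half[u]
                    point[w] += (j + 0.5) * dw - half[w]
                    yield point, normal, du * dw

def volume(half):
    return 8.0 * half[0] * half[1] * half[2]

def grad(p, center, half):          # definition (4g)
    total = [0.0, 0.0, 0.0]
    for point, normal, dS in box_patches(center, half):
        value = p(point)
        for k in range(3):
            total[k] += normal[k] * value * dS
    return [t / volume(half) for t in total]

def curl(q, center, half):          # definition (4c)
    total = [0.0, 0.0, 0.0]
    for point, normal, dS in box_patches(center, half):
        term = cross(normal, q(point))
        for k in range(3):
            total[k] += term[k] * dS
    return [t / volume(half) for t in total]

def div(q, center, half):           # definition (4d)
    total = sum(dot(normal, q(point)) * dS
                for point, normal, dS in box_patches(center, half))
    return total / volume(half)

def laplacian(psi, center, half, eps=1e-4):   # definition (4L), scalar psi
    total = 0.0
    for point, normal, dS in box_patches(center, half):
        ahead = [x + eps * nk for x, nk in zip(point, normal)]
        behind = [x - eps * nk for x, nk in zip(point, normal)]
        total += (psi(ahead) - psi(behind)) / (2.0 * eps) * dS
    return total / volume(half)

def p(r):
    x, y, z = r
    return x * x + y * z

def q(r):
    x, y, z = r
    return [x * y - y, x + y * z, z * x]

center, half = (1.0, 2.0, 3.0), (0.01, 0.01, 0.01)
print(grad(p, center, half))        # Cartesian check: (2x, z, y) = (2, 3, 2)
print(curl(q, center, half))        # Cartesian check: (-y, -z, 2 - x) = (-2, -3, 1)
print(div(q, center, half))         # Cartesian check: x + y + z = 6
print(laplacian(p, center, half))   # Cartesian check: 2
```

The fields were chosen to be of low polynomial degree so that the midpoint rule introduces no appreciable error; for general fields the agreement would be only approximate.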

But remember the "if": Theorems (5g) to (5L) depend on definitions (4g) to (4L) and are therefore only as definite as those definitions! Equations (3), without assuming anything about the shapes and sizes of the closed surfaces δS (except, tacitly, that n̂ is piecewise well-defined), indicate that the surface integrals are additive with respect to volume. But this additivity, by itself, does not guarantee that the surface integrals are shared among neighboring volume elements in proportion to their volumes, as envisaged by "definitions" (4g) to (4L). Each of these "definitions" is unambiguous if, and only if, the ratio of the surface integral to dV  is insensitive to the shape and size of δS for a sufficiently small δS. Notice that the issue here is not whether the ratios specified in equations (4g) to (4L) are true vectors or scalars, independent of the coordinates; all of the operations needed in those equations have coordinate-free definitions. Rather, the issue is whether the resulting ratios are unambiguous notwithstanding the ambiguity of δS, provided only that δS is sufficiently small. That is the advertised "caveat", which must now be addressed.
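
The insensitivity in question can at least be made plausible by experiment. The sketch below evaluates the ratio in (4d) for one sample field at one point, taking δS in turn to be a small cube, a flattened box, and a small sphere. The shapes, the midpoint quadratures, the helper names, and the field `q` are all assumptions made for illustration, and the Cartesian value of the divergence is used only as a reference.

```python
import math

def box_patches(center, half, n=4):
    """Yield (point, outward unit normal, area) for midpoint patches on a box."""
    for axis in range(3):
        u, w = [k for k in range(3) if k != axis]
        du, dw = 2.0 * half[u] / n, 2.0 * half[w] / n
        for sign in (-1.0, 1.0):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            for i in range(n):
                for j in range(n):
                    point = list(center)
                    point[axis] += sign * half[axis]
                    point[u] += (i + 0.5) * du - half[u]
                    point[w] += (j + 0.5) * dw - half[w]
                    yield point, normal, du * dw

def sphere_patches(center, radius, n_theta=100, n_phi=200):
    """Yield (point, outward unit normal, area) for a latitude-longitude grid."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            point = [c + radius * nk for c, nk in zip(center, normal)]
            yield point, normal, radius ** 2 * math.sin(theta) * d_theta * d_phi

def q(r):
    """Sample vector field (illustrative choice)."""
    x, y, z = r
    return (x * x, y * z, math.sin(z))

def flux_per_volume(patches, volume):
    """Ratio of the closed-surface flux integral to the enclosed volume, as in (4d)."""
    total = 0.0
    for point, normal, dS in patches:
        total += sum(nk * qk for nk, qk in zip(normal, q(point))) * dS
    return total / volume

center = (1.0, 2.0, 0.5)
reference = 2.0 * center[0] + center[2] + math.cos(center[2])  # Cartesian check value

cube = flux_per_volume(box_patches(center, (0.01, 0.01, 0.01)), 8.0 * 0.01 ** 3)
slab = flux_per_volume(box_patches(center, (0.02, 0.005, 0.01)), 8.0 * 0.02 * 0.005 * 0.01)
ball = flux_per_volume(sphere_patches(center, 0.01), 4.0 / 3.0 * math.pi * 0.01 ** 3)
print(reference, cube, slab, ball)
```

The three ratios agree with one another, and with the reference value, to within the accuracy of the quadrature. Such an experiment is no substitute for the arguments that follow, but it shows what those arguments must establish.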

In accordance with our "applied" mathematical purpose, our proofs of the unambiguity of the differential operators will rest on a few thought experiments, each of which applies an operator to a physical field, say f, and obtains another physical field whose unambiguity is beyond dispute. The conclusion of the thought experiment is then applicable to any operand field whose mathematical properties are consistent with its interpretation as the physical field f ; the loss of generality, if any, is only what is incurred by that interpretation.

Unambiguity of the gradient

Suppose that a fluid with density ρ (a scalar field) flows with velocity v (a vector field) under the influence of the internal pressure p (a scalar field). Then the integral in (4g) is the force exerted by the pressure of the fluid inside δS on the fluid outside, so that minus the integral is the force exerted on the fluid inside δS by the pressure of the fluid outside. Dividing by dV, we find that −∇p, as defined by (4g), is the force per unit volume, due to the pressure outside the volume.[6] If this is the only force per unit volume acting on the volume (e.g., because the fluid is non-viscous and in a weightless environment, and the volume element is not in contact with the container), then it is equal to the acceleration times the mass per unit volume; that is,

$$\rho\,\frac{d\mathbf v}{dt} = -\nabla p\,. \tag{6g}$$

Now provided that the left side of this equation is locally continuous, it can be considered uniform inside the small δS, so that the left side is unambiguous, whence ∇p is also unambiguous. If there are additional forces on the fluid element, e.g. due to gravity and/or viscosity, then −∇p is not the sole contribution to density-times-acceleration, but is still the contribution due to pressure, which is still unambiguous.

By showing the unambiguity of definition (4g), we have confirmed theorem (5g). In the process we have seen that the volume-based definition of the gradient is useful for the modeling of fluids, and intuitive in that it formalizes the common notion that a pressure "gradient" gives rise to a force.

Unambiguity of the divergence

In the aforesaid fluid, in a short time dt, the volume that flows out of the fixed closed surface δS  through a fixed surface element of area dS  is v dt ⸱ n̂ dS.  Multiplying by density and integrating over δS, we find that the mass flowing out of δS  in time dt is $\iint_{\delta S}\rho\,\mathbf v\,dt\cdot\hat{\mathbf n}\,dS$.  Dividing this by dV, and then by dt, we get the rate of reduction of density inside δS ; that is,

$$\frac{1}{dV}\iint_{\delta S}\rho\,\mathbf v\cdot\hat{\mathbf n}\,dS = -\frac{\partial\rho}{\partial t}\,,$$

where the derivative w.r.t. time is evaluated at a fixed location (because δS is fixed), and is therefore written as a partial derivative (because other variables on which ρ might depend, namely the coordinates, are held constant). Provided that the right-hand side is locally continuous, it can be considered uniform inside δS and is therefore unambiguous, so that the left side is likewise unambiguous. But the left side is simply div ρv  as defined by (4d),[f] which is therefore also unambiguous,[7] confirming theorem (5d). In short, the divergence operator is that which maps ρv to the rate of reduction of density at a fixed point:

$$\operatorname{div}\rho\mathbf v = -\frac{\partial\rho}{\partial t}\,. \tag{7d}$$

This result, which expresses conservation of mass, is a form of the so-called equation of continuity.

The partial derivative ∂ρ/∂t in (7d) must be distinguished from the material derivative dρ/dt, which is evaluated at a point that moves with the fluid.[g] [Similarly, dv/dt in (6g) is the material acceleration, because it is the acceleration of the mobile mass, not of a fixed point!]  To re-derive the equation of continuity in terms of the material derivative, the volume v dt ⸱ n̂ dS , which flows out through dS in time dt (as above), is integrated over δS to obtain the increase in volume of the mass initially contained in dV. Dividing this by the mass, ρ dV, gives the increase in specific volume (1/ρ) of that mass, and then dividing by dt gives the rate of change of specific volume; that is,

$$\frac{1}{\rho\,dV}\iint_{\delta S}\mathbf v\cdot\hat{\mathbf n}\,dS = \frac{d}{dt}\!\left(\frac{1}{\rho}\right) = -\frac{1}{\rho^2}\,\frac{d\rho}{dt}\,.$$

Multiplying by ρ² and comparing the left side with (4d), we obtain

$$\rho\,\operatorname{div}\mathbf v = -\frac{d\rho}{dt}\,. \tag{7d'}$$

Whereas (7d) shows that div ρv is unambiguous, (7d') shows that div v is unambiguous (provided that the right-hand sides are locally continuous). In accordance with the everyday meaning of "divergence", (7d') also shows that div v is positive if the fluid is expanding (ρ decreasing), negative if it is contracting (ρ increasing), and zero if it is incompressible. In the last case, the equation of continuity reduces to

$$\operatorname{div}\mathbf v = 0 \qquad [\,\text{for an incompressible fluid}\,]. \tag{7i}$$

For incompressible flow, any tubular surface tangential to the flow velocity, and consequently with no flow in or out of the "tube", has the same volumetric flow rate across all cross-sections of the "tube", as if the surface were the wall of a pipe full of liquid (except that the surface is not necessarily stationary). Accordingly, a vector field with zero divergence is described as solenoidal (from the Greek word for "pipe"). More generally, a solenoidal vector field has the property that for any tubular surface tangential to the field, the flux integrals across any two cross-sections of the "tube" are the same—because otherwise there would be a net flux integral out of the closed surface comprising the two cross-sections and any segment of tube between them, in which case, by the divergence theorem (5d), the divergence would have to be non-zero somewhere inside, contrary to (7i).
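
Two elementary velocity fields illustrate the sign conventions just described. In the sketch below, a uniform expansion v = K r is expected to have the divergence 3K, so that by (7d') the density of a moving fluid element falls at the rate 3Kρ, while a rigid rotation v = Ω × r is expected to be solenoidal. The constants `K` and `OMEGA`, the cube used for δS, and the helper names are assumptions made for illustration.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def box_patches(center, half, n=4):
    """Yield (point, outward unit normal, area) for midpoint patches on a box."""
    for axis in range(3):
        u, w = [k for k in range(3) if k != axis]
        du, dw = 2.0 * half[u] / n, 2.0 * half[w] / n
        for sign in (-1.0, 1.0):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            for i in range(n):
                for j in range(n):
                    point = list(center)
                    point[axis] += sign * half[axis]
                    point[u] += (i + 0.5) * du - half[u]
                    point[w] += (j + 0.5) * dw - half[w]
                    yield point, normal, du * dw

def div(v, center, half):
    """Definition (4d): closed-surface flux integral per unit volume."""
    total = 0.0
    for point, normal, dS in box_patches(center, half):
        total += sum(nk * vk for nk, vk in zip(normal, v(point))) * dS
    return total / (8.0 * half[0] * half[1] * half[2])

K = 0.5                   # hypothetical expansion rate (per unit time)
OMEGA = (0.0, 0.0, 2.0)   # hypothetical angular velocity vector

def expansion(r):
    """Uniform expansion about the origin: v = K r."""
    return [K * x for x in r]

def rotation(r):
    """Rigid rotation about the origin: v = OMEGA x r."""
    return cross(OMEGA, r)

center, half = (1.0, 2.0, 3.0), (0.01, 0.01, 0.01)
div_expansion = div(expansion, center, half)
div_rotation = div(rotation, center, half)
print(div_expansion, div_rotation)
```

The first result is positive (the fluid is expanding) and equal to 3K to within rounding error; the second vanishes to within rounding error, as befits a flow that merely carries the fluid around without changing its density.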

Unambiguity of the curl (and gradient)

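The unambiguity of the curl (4c) follows from the unambiguity of the divergence. Taking dot-products of (4c) with an arbitrary constant vector b, we get

$$\operatorname{curl}\mathbf q\cdot\mathbf b = \frac{1}{dV}\iint_{\delta S}(\hat{\mathbf n}\times\mathbf q)\cdot\mathbf b\;dS = \frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\cdot(\mathbf q\times\mathbf b)\;dS\,;$$

that is, by (4d),

$$\operatorname{curl}\mathbf q\cdot\mathbf b = \operatorname{div}\,(\mathbf q\times\mathbf b) \qquad [\,\text{for uniform } \mathbf b\,]. \tag{8c}$$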

(The parentheses on the right, although helpful because of the spacing, are not strictly necessary, because the alternative binding would be (div q), which is a scalar, whose cross-product with the vector b is not defined. And the left-hand expression does not need parentheses, because it can only mean the dot-product of a curl with the vector b; it cannot mean the curl of a dot-product, because the curl of a scalar field is not defined.) This result (8c) is an identity if the vector b is independent of location, so that it can be taken inside or outside the surface integral; thus b may be a uniform vector field, and may be time-dependent. If we make b a unit vector, the left side of the identity is the (scalar) component of curl q in the direction of b, and the right side is unambiguous. Thus the curl is unambiguous because its component in any direction is unambiguous. This confirms theorem (5c).
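
Identity (8c) lends itself to a direct numerical check, because both sides can be computed from definitions (4c) and (4d) on the same volume element. The following sketch assumes a small cube for δS, a midpoint rule on its faces, and an arbitrary sample field `q` and uniform vector `b`; all names are illustrative, and the Cartesian value of the curl is quoted only as a cross-check.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def box_patches(center, half, n=4):
    """Yield (point, outward unit normal, area) for midpoint patches on a box."""
    for axis in range(3):
        u, w = [k for k in range(3) if k != axis]
        du, dw = 2.0 * half[u] / n, 2.0 * half[w] / n
        for sign in (-1.0, 1.0):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            for i in range(n):
                for j in range(n):
                    point = list(center)
                    point[axis] += sign * half[axis]
                    point[u] += (i + 0.5) * du - half[u]
                    point[w] += (j + 0.5) * dw - half[w]
                    yield point, normal, du * dw

def curl(q, center, half):          # definition (4c)
    total = [0.0, 0.0, 0.0]
    for point, normal, dS in box_patches(center, half):
        term = cross(normal, q(point))
        for k in range(3):
            total[k] += term[k] * dS
    return [t / (8.0 * half[0] * half[1] * half[2]) for t in total]

def div(q, center, half):           # definition (4d)
    total = sum(dot(normal, q(point)) * dS
                for point, normal, dS in box_patches(center, half))
    return total / (8.0 * half[0] * half[1] * half[2])

def q(r):
    x, y, z = r
    return [x * y - y, x + y * z, z * x]

b = (0.3, -1.2, 0.5)                # uniform vector (arbitrary choice)
center, half = (1.0, 2.0, 3.0), (0.01, 0.01, 0.01)

lhs = dot(curl(q, center, half), b)                   # curl q . b
rhs = div(lambda r: cross(q(r), b), center, half)     # div (q x b)
print(lhs, rhs)   # Cartesian check: (-y, -z, 2 - x) . b = 3.5 at this point
```

The two sides agree to within rounding error for any box, because the integrands agree at every surface point by the scalar triple product.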

Similarly, the unambiguity of the divergence implies the unambiguity of the gradient. Starting with (4g), taking dot-products with an arbitrary uniform vector b, and proceeding as above, we obtain

$$\nabla p\cdot\mathbf b = \operatorname{div}\,p\,\mathbf b \qquad [\,\text{for uniform } \mathbf b\,]. \tag{8g}$$

(The left-hand side does not need parentheses, because it can only mean the dot-product of a gradient with the vector b; it cannot mean the gradient of the dot-product of a scalar field with a vector field, because that dot-product would not be defined.) If we make b a unit vector, this result (8g) says that the (scalar) component of ∇p in the direction of b is given by the right-hand side, which again is unambiguous. So here we have a second explanation of the unambiguity of the gradient: like the curl, it is unambiguous because its component in any direction is unambiguous.

We might well ask what happens if we take cross-products with b on the left, instead of dot-products. If we start with (4g), the process is straightforward: in the end we can switch the order of the cross-product on the left, and change the sign on the right, obtaining

$$\nabla p\times\mathbf b = \operatorname{curl}\,p\,\mathbf b \qquad [\,\text{for uniform } \mathbf b\,]. \tag{8p}$$

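(Again no parentheses are needed.) If we start with (4c) instead, and take b inside the integral, we get a vector triple product to expand, which leads to

$$\mathbf b\times\operatorname{curl}\mathbf q = \frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\;\mathbf b\cdot\mathbf q\;dS \;-\; \frac{1}{dV}\iint_{\delta S}\mathbf b\cdot\hat{\mathbf n}\;\,\mathbf q\;dS\,,$$

in which the first term on the right is simply  ∇ b⸱q  (the gradient of the dot-product). The second term is more problematic. If we had a scalar p instead of the vector q, we could take b outside the second integral, so that the second term would be (minus) b ⸱ ∇p. This suggests that the actual second term should be (minus) b ⸱ ∇q.  Shall we therefore adopt the second term (without the sign) as the definition of b⸱∇ q for a vector q (treating b⸱∇ as an operator), and write

$$\mathbf b\times\operatorname{curl}\mathbf q = \nabla\,\mathbf b\cdot\mathbf q \;-\; \mathbf b\cdot\nabla\,\mathbf q \qquad [\,\text{for uniform } \mathbf b\,]\;? \tag{8q}$$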

The proposal would be open to the objection that  b⸱∇ q  had been defined only for uniform b , whereas  b ⸱ ∇p (for scalar p) is defined whether b is uniform or not.  So, for the moment, let us put (8q) aside and run with (8c), (8g), and (8p).

Another meaning of the gradient

Let ŝ be a unit vector in a given direction, and let s be a parameter measuring distance (arc length) along a path in that direction. By equation (8g) and definition (4d), we have

$$\nabla p\cdot\hat{\mathbf s} = \operatorname{div}\,p\,\hat{\mathbf s} = \frac{1}{dV}\iint_{\delta S}\hat{\mathbf n}\cdot\hat{\mathbf s}\;p\;dS\,,$$

where, by the unambiguity of the divergence, the shape of the closed surface δS enclosing dV  can be chosen for convenience. So let δS be a right cylinder with cross-sectional area α  and perpendicular height ds , with the path passing perpendicularly through the end-faces at parameter-values s and s+ds , where the outward unit normal n̂ consequently takes the values −ŝ and ŝ , respectively. And let the cross-sectional dimensions be small compared with ds  so that the values of p at the end-faces, say p and p+dp, can be taken to be the same as where the end-faces cut the path. Then  dV = α ds , and the surface integral over δS includes only the contributions from the end-faces (because n̂ is perpendicular to ŝ elsewhere); those contributions are respectively  −ŝ⸱ŝ p α  and  ŝ⸱ŝ (p+dp) α ,  i.e.  −p α  and  (p+dp) α .  With these substitutions the above equation becomes

$$\nabla p\cdot\hat{\mathbf s} = \frac{1}{\alpha\,ds}\Big[-p\,\alpha + (p+dp)\,\alpha\Big] = \frac{dp}{ds}\,;$$

that is,

$$\hat{\mathbf s}\cdot\nabla p = \partial_s\,p\,, \tag{9g}$$

where the right-hand side, commonly called the directional derivative of p in the ŝ direction,[8] is the derivative of p w.r.t. distance in that direction. Although (9g) has been obtained by taking that direction as fixed, the equality is evidently maintained if s measures arc length along any path tangential to ŝ at the point of interest.
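
Equation (9g) can be checked by computing its two sides independently: the left from the volume-based definition (4g), and the right from a difference quotient along the direction ŝ. The sketch below assumes a small cube for δS, a central difference for the derivative w.r.t. distance, and an arbitrary sample field `p` and unit vector `s_hat`; the helper names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def box_patches(center, half, n=4):
    """Yield (point, outward unit normal, area) for midpoint patches on a box."""
    for axis in range(3):
        u, w = [k for k in range(3) if k != axis]
        du, dw = 2.0 * half[u] / n, 2.0 * half[w] / n
        for sign in (-1.0, 1.0):
            normal = [0.0, 0.0, 0.0]
            normal[axis] = sign
            for i in range(n):
                for j in range(n):
                    point = list(center)
                    point[axis] += sign * half[axis]
                    point[u] += (i + 0.5) * du - half[u]
                    point[w] += (j + 0.5) * dw - half[w]
                    yield point, normal, du * dw

def grad(p, center, half):
    """Definition (4g): closed-surface integral of n p dS per unit volume."""
    total = [0.0, 0.0, 0.0]
    for point, normal, dS in box_patches(center, half):
        value = p(point)
        for k in range(3):
            total[k] += normal[k] * value * dS
    return [t / (8.0 * half[0] * half[1] * half[2]) for t in total]

def p(r):
    x, y, z = r
    return x * x + y * z

s_hat = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0)   # a unit vector (arbitrary direction)
center, half = (1.0, 2.0, 3.0), (0.01, 0.01, 0.01)

# Left side of (9g): component of the volume-based gradient along s_hat.
component = dot(s_hat, grad(p, center, half))

# Right side of (9g): derivative of p w.r.t. distance along s_hat.
eps = 1e-4
ahead = [c + eps * s for c, s in zip(center, s_hat)]
behind = [c - eps * s for c, s in zip(center, s_hat)]
directional = (p(ahead) - p(behind)) / (2.0 * eps)

print(component, directional)   # both close to 5/3 for this field and direction
```

Repeating the comparison for other unit vectors gives a component that never exceeds the magnitude of the computed gradient.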

Equation (9g) is an alternative definition of the gradient: it says that the gradient of p is the vector whose scalar component in any direction is the directional derivative of p in that direction. For real p, this component has its maximum, namely |∇p| , in the direction of ∇p; thus the gradient of p is the vector whose direction is that in which the derivative of p w.r.t. distance is a maximum, and whose magnitude is that maximum. This is the usual conceptual definition of the gradient.[9] Sometimes it is convenient to work directly from this definition. For example, in Cartesian coordinates (x, y, z), if a scalar field is given by x , its gradient is obviously the unit vector in the direction of the x axis, usually called i; that is, ∇x = i. Similarly, if  r = r r̂  is the position vector, then ∇r = r̂.

If  ŝ is tangential to a level surface of p (a surface of constant p), then $\partial_s p$  in that direction is zero, in which case (9g) says that ∇p (if not zero) is orthogonal to ŝ.  So ∇p is orthogonal to the surfaces of constant p (as we would expect, having just shown that the direction of ∇p is that in which p varies most steeply).

If p is uniform (that is, if it has no spatial variation), then its derivative w.r.t. distance in every direction is zero; that is, the component of ∇p in every direction is zero, so that ∇p must be the zero vector. In short, the gradient of a uniform scalar field is zero. Conversely, if p is not uniform, there must be some location and some direction in which its derivative w.r.t. distance, if defined at all, is non-zero, so that its gradient, if defined at all, is also non-zero. Thus a scalar field with zero gradient in some region is uniform in that region.

Unambiguity of the Laplacian

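Armed with our new definition of the gradient (9g), we can revisit our definition of the Laplacian (4L). If ψ is a scalar field, then, by (9g), ∂ₙψ can be replaced by n̂⸱∇ψ in (4L), which then becomes

$$\triangle\psi \,=\, \frac{1}{dV}\oiint_{\delta S} \hat{\mathbf{n}}\cdot\nabla\psi \; dS \,; \tag{9L}$$

that is, by definition (4d),

$$\triangle\psi \,=\, \operatorname{div}\nabla\psi \qquad [\text{for scalar } \psi]. \tag{9L'}$$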

So the Laplacian of a scalar field is the divergence of the gradient. This is the usual introductory definition of the Laplacian—and on its face is applicable only in the case of a scalar field. The unambiguity of the Laplacian, in this case, follows from the unambiguity of the divergence and the gradient.

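If, on the contrary, ψ in definition (4L) is a vector field, then we can again take dot-products with a uniform vector b, obtaining

$$\mathbf{b}\cdot\triangle\boldsymbol{\psi} \,=\, \frac{1}{dV}\oiint_{\delta S} \mathbf{b}\cdot\partial_n\boldsymbol{\psi} \; dS \,=\, \frac{1}{dV}\oiint_{\delta S} \partial_n(\mathbf{b}\cdot\boldsymbol{\psi}) \; dS \,=\, \triangle(\mathbf{b}\cdot\boldsymbol{\psi})\,.$$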

If we make b a unit vector, this says that the scalar component of the Laplacian of a vector field, in any direction, is the Laplacian of the scalar component of that vector field in that direction. As we have just established that the latter is unambiguous, so is the former.

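But the unambiguity of the Laplacian can be generalized further. If

$$\psi \,=\, \sum_i \alpha_i\,\psi_i \,,$$

where each ψᵢ is a scalar field, and each αᵢ is a constant, and the counter i ranges from (say) 1 to k , then it is clear from (4L) that

$$\triangle\psi \,=\, \sum_i \alpha_i\,\triangle\psi_i \,. \tag{10}$$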

In words, this says that the Laplacian of a linear combination of fields is the same linear combination of the Laplacians of the same fields; or, more concisely, that the Laplacian is linear. I say "it is clear" because the Laplacian as defined by (4L) is itself a linear combination, so that (10) merely asserts that we can regroup the terms of a nested linear combination; the gradient, curl, and divergence as defined by (4g) to (4d) are likewise linear. It follows from (10) that the Laplacian of a linear combination of fields is unambiguous if the Laplacians of the separate fields are unambiguous. Now we have supposed that the fields are scalar and that the coefficients αᵢ are constants. But the same logic applies if the "constants" are uniform basis vectors (e.g., i, j, k), so that the "linear combination" can represent any vector field, whence the Laplacian of any vector field is unambiguous. And the same logic applies if the "constants" are chosen as a "basis" for a space of tensors of any order, so that the Laplacian of any tensor field of that order is unambiguous, and so on. In short, the Laplacian of any field that we can express with a uniform basis is unambiguous.
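The definition of the Laplacian as a closed-surface integral of the outward normal derivative, per unit volume, lends itself to a direct numerical check. The sketch below is only illustrative: the small cube, the one-sample-per-face midpoint rule, the example fields `psi` and `chi`, and all function names are assumptions, not anything prescribed by the text. It approximates the surface integral over a small cube, and then checks linearity in the sense of (10).

```python
def psi(x, y, z):
    """Example scalar field (assumed): psi = x**2 * y + z**3, whose Laplacian is 2y + 6z."""
    return x ** 2 * y + z ** 3

def chi(x, y, z):
    """Second example field (assumed): chi = x*y*z, whose Laplacian is zero."""
    return x * y * z

def normal_derivative(f, point, n_hat, h=1e-4):
    """Central-difference derivative of f w.r.t. distance along the unit normal n_hat."""
    fwd = [c + h * n for c, n in zip(point, n_hat)]
    back = [c - h * n for c, n in zip(point, n_hat)]
    return (f(*fwd) - f(*back)) / (2.0 * h)

def laplacian_by_surface_integral(f, centre, a=1e-2):
    """Sum the outward normal derivative over the six faces of a cube of side a
    (one midpoint sample per face), then divide by the volume of the cube."""
    total = 0.0
    for axis in range(3):
        for sign in (1.0, -1.0):
            n_hat = [0.0, 0.0, 0.0]
            n_hat[axis] = sign                     # outward unit normal of this face
            face_centre = list(centre)
            face_centre[axis] += sign * a / 2.0
            total += normal_derivative(f, face_centre, n_hat) * a * a
    return total / a ** 3

def combo(x, y, z):
    """A linear combination of the two example fields."""
    return 2.0 * psi(x, y, z) + 3.0 * chi(x, y, z)

centre = (1.0, 2.0, 3.0)
lap_psi = laplacian_by_surface_integral(psi, centre)      # analytic value: 2*2 + 6*3 = 22
lap_chi = laplacian_by_surface_integral(chi, centre)      # analytic value: 0
lap_combo = laplacian_by_surface_integral(combo, centre)  # should be 2*lap_psi + 3*lap_chi
```

For these low-degree polynomial fields the midpoint samples happen to be exact apart from rounding; for a general field the agreement would only be approximate and would improve as the cube shrinks.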

The dot-del, del-cross, and del-dot operators

The gradient operator ∇ is also called del.[h] If it simply denotes the gradient, we tend to pronounce it "grad" in order to emphasize the result. But it can also appear in combination with other operators to give other results, and in those contexts we tend to pronounce it "del".

One such combination is "dot del", as in "b⸱∇ ", which we proposed for (8q), but did not quite manage to define satisfactorily for a vector operand. With our new definition of the gradient (9g), we can now make a second attempt. A general vector field q can be written |q| q̂ , so that

$$\mathbf{q}\cdot\nabla\psi \,=\, |\mathbf{q}|\;\hat{\mathbf{q}}\cdot\nabla\psi \,.$$

If ψ is a scalar field, we can apply (9g) to the right-hand side, obtaining

$$\mathbf{q}\cdot\nabla\psi \,=\, |\mathbf{q}|\;\partial_{s_q}\psi \,,$$

where s_q is distance in the direction of q. For scalar ψ, this result is an identity between previously defined quantities. For non-scalar ψ, we have not yet defined the left-hand side, but the right-hand side is still well-defined and self-explanatory (provided that we can differentiate ψ w.r.t. s_q). So we are free to adopt

$$\mathbf{q}\cdot\nabla\psi \,\triangleq\, |\mathbf{q}|\;\partial_{s_q}\psi \,, \tag{11}$$

where s_q is distance in the direction of q , as the general definition of the operator q⸱∇ , and to interpret it as defining both a unary operator q⸱∇ which operates on a generic field, and a binary operator ⸱∇ which takes a (possibly uniform) vector field on the left and a generic field on the right.

For any vector field q , it follows from (11) that if ψ is a uniform field, then q⸱∇ψ = 0.

For the special case in which q is a unit vector ŝ , with s measuring distance in the direction of  ŝ , definition (11) reduces to

$$\hat{\mathbf{s}}\cdot\nabla\psi \,=\, \partial_s\psi \,, \tag{12}$$

which agrees with (9g) but now holds for a generic field ψ [whereas (9g) was for a scalar field, and was derived as a theorem based on earlier definitions]. So ŝ⸱∇ , with a unit vector ŝ , is the directional-derivative operator on a generic field.

In particular, if  ŝ = n̂  we have

$$\hat{\mathbf{n}}\cdot\nabla\psi \,=\, \partial_n\psi \,,$$

which we may substitute into the original definition of the Laplacian (4L) to obtain

$$\triangle\psi \,=\, \frac{1}{dV}\oiint_{\delta S} \hat{\mathbf{n}}\cdot\nabla\psi \; dS \,, \tag{13L}$$

which is just (9L) again, except that it now holds for a generic field.

If our general definition of the gradient (4g) is also taken as the general definition of the ∇ operator,[10] then, comparing (4g) with (4c), (4d), and (13L), we see that

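where the parentheses may seem to be required on account of the closing dS in (4g). But if we write the factor dS before the integrand, the del operator in (4g) becomes

$$\nabla \,=\, \frac{1}{dV}\oiint_{\delta S} dS \; \hat{\mathbf{n}}$$

if  we insist that it is to be read as an operator looking for an operand, and not as a self-contained expression. Then, if we similarly bring forward the dS in (4c), (4d), and (13L), the respective operators become

$$\operatorname{curl} \,=\, \nabla\times \;; \qquad \operatorname{div} \,=\, \nabla\,\cdot \;; \qquad \triangle \,=\, \nabla\cdot\nabla \tag{14}$$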

(pronounced "del cross", "del dot", and "del dot del"), of which the last is usually abbreviated as ∇²  ("del squared").[i] Because these operational equivalences follow from coordinate-free definitions, they must remain valid when correctly expressed in any coordinate system.[j] That does not mean that they are always convenient or always conducive to the avoidance of error, of which we shall have more to say in due course. But they sometimes make useful mnemonic devices. For example, they let us rewrite identities (8c), (8g), and (8p) as

$$\mathbf{b}\cdot\nabla\times\mathbf{q} \,=\, \nabla\cdot(\mathbf{q}\times\mathbf{b}) \,, \qquad \mathbf{b}\cdot\nabla p \,=\, \nabla\cdot(p\,\mathbf{b}) \,, \qquad \nabla\times(p\,\mathbf{b}) \,=\, \nabla p\times\mathbf{b} \qquad \text{for uniform } \mathbf{b}. \tag{15}$$

These would be basic algebraic vector identities if ∇ were an ordinary vector, and one could try to derive them from the "algebraic" behavior of ∇; but they're not, because it isn't, so we didn't!  Moreover, these simple "algebraic" rules are for a uniform b, and do not of themselves tell us what to do if  b is spatially variable; for example, (8g) is not applicable to (7d).

The advection operator

Variation or transportation of a property of a medium due to motion with the medium is called advection (which, according to its Latin roots, means "carrying to"). Suppose that a medium (possibly a fluid) moves with a velocity field v in some inertial reference frame. Let ψ be a field (possibly a scalar field or a vector field) expressing some property of the medium (e.g., density, or acceleration, or stress,[k]… or even v itself). We have seen that the time-derivative of ψ may be specified in two different ways: as the partial derivative ∂ψ/∂t , evaluated at a fixed point (in the chosen reference frame), or as the material derivative dψ/dt, evaluated at a point moving at velocity v (i.e., with the medium). The difference  dψ/dt − ∂ψ/∂t is due to motion with the medium. To find another expression for this difference, let s be a parameter measuring distance along the path traveled by a particle of the medium. Then, for a short time interval dt, the surface-plot of the small change in ψ (or each component thereof) as a function of the small changes in t and s  (plotted on perpendicular axes) can be taken as a plane through the origin, so that

$$d\psi \,=\, \frac{\partial\psi}{\partial t}\,dt \,+\, \frac{\partial\psi}{\partial s}\,ds \;;$$

that is, the change in ψ is the sum of the changes due to the change in t and the change in s . Dividing by dt gives

$$\frac{d\psi}{dt} \,=\, \frac{\partial\psi}{\partial t} \,+\, \frac{\partial\psi}{\partial s}\,\frac{ds}{dt} \,,$$

i.e.,

$$\frac{d\psi}{dt} \,=\, \frac{\partial\psi}{\partial t} \,+\, |\mathbf{v}|\;\partial_s\psi$$

(and the first term on the right could have been written ∂ₜψ). So the second term on the right is the contribution to the material derivative due to motion with the medium; it is called the advective term, and is non-zero wherever a particle of the medium moves along a path on which ψ varies with location, even if ψ at each location is constant over time.  So the operator  |v| ∂ₛ , where s measures distance along the path, is the advection operator: it maps a property of a medium to the advective term in the time-derivative of that property. If ψ is v itself, the above result becomes

$$\frac{d\mathbf{v}}{dt} \,=\, \frac{\partial\mathbf{v}}{\partial t} \,+\, |\mathbf{v}|\;\partial_s\mathbf{v} \,,$$

where the left-hand side (the material acceleration) is as given by Newton's second law, and the first term on the right (which we might call the "partial" acceleration) is the time-derivative of velocity in the chosen reference frame, and the second term on the right (the advective term) is the correction that must be added to the "partial" acceleration in order to obtain the material acceleration. This term is non-zero wherever velocity is non-zero and varies along a path, even if the velocity at each point on the path is constant over time (as when water speeds up while flowing at a constant volumetric rate into a nozzle). Paradoxically, while the material acceleration and the "partial" acceleration are apparently linear (first-degree) in v, their difference (the advective term) is not. Thus the distinction between ∂ψ/∂t and dψ/dt  has the far-reaching implication that fluid dynamics is non-linear.

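Applying (11) to the last two equations, we obtain respectively

$$\frac{d\psi}{dt} \,=\, \frac{\partial\psi}{\partial t} \,+\, \mathbf{v}\cdot\nabla\psi \tag{16}$$

and

$$\frac{d\mathbf{v}}{dt} \,=\, \frac{\partial\mathbf{v}}{\partial t} \,+\, \mathbf{v}\cdot\nabla\mathbf{v} \,, \tag{16v}$$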

where, in each case, the second term on the right is the advective term. So the advection operator can also be written v⸱∇ .
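Relation (16) can be illustrated numerically in one dimension. In the following sketch, the steady velocity field v = 1 + x, the property ψ = x² + 3t, and the closed-form particle path are all assumptions made up for the example. The partial derivative is estimated at a fixed location, the material derivative is estimated while moving with a particle, and their difference is compared with the advective term.

```python
import math

def v(x):
    """Steady 1D velocity field (assumed for illustration): the flow speeds up downstream."""
    return 1.0 + x

def psi(x, t):
    """Example property of the medium (assumed): varies with both location and time."""
    return x ** 2 + 3.0 * t

def particle_position(x0, t):
    """Exact path of the particle that is at x0 at time 0, i.e. the solution of dx/dt = 1 + x."""
    return (1.0 + x0) * math.exp(t) - 1.0

x0, h = 2.0, 1e-5

# Partial derivative: hold the location fixed and vary t.
partial_t = (psi(x0, h) - psi(x0, -h)) / (2.0 * h)

# Advective term: in 1D, v times the derivative of psi w.r.t. distance at fixed t.
advective = v(x0) * (psi(x0 + h, 0.0) - psi(x0 - h, 0.0)) / (2.0 * h)

# Material derivative: vary t while moving with the particle.
material = (psi(particle_position(x0, h), h)
            - psi(particle_position(x0, -h), -h)) / (2.0 * h)
```

With these choices the analytic values are 3 for the partial derivative, 12 for the advective term, and 15 for the material derivative, so `material` should match `partial_t + advective` to within the finite-difference error.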

When the generic ψ  in (16) is replaced by the density ρ , we get a relation between ∂ρ/∂t and dρ/dt, both of which we have seen before, in equations (7d) and (7d') above. Substituting from those equations then gives

$$\operatorname{div}\rho\mathbf{v} \,=\, \rho\,\operatorname{div}\mathbf{v} \,+\, \mathbf{v}\cdot\nabla\rho \,, \tag{17}$$

where ∇ρ can be taken as a gradient since ρ is scalar. This result is in fact an identity (a product rule for the divergence), as we shall eventually confirm by another method.
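As a sanity check on the product rule (17), one can compare the two sides by finite differences in Cartesian coordinates. This is a sketch under stated assumptions: the density and velocity fields below are arbitrary polynomials invented for the test, and the Cartesian formulas for the divergence and gradient are taken for granted rather than derived from the coordinate-free definitions.

```python
def rho(x, y, z):
    """Example density field (assumed)."""
    return 1.0 + x * y + z ** 2

def vel(x, y, z):
    """Example velocity field (assumed)."""
    return (y * z, x ** 2, x + z)

def partial(f, point, axis, h=1e-5):
    """Central-difference partial derivative of a scalar function along one axis."""
    fwd = list(point)
    back = list(point)
    fwd[axis] += h
    back[axis] -= h
    return (f(*fwd) - f(*back)) / (2.0 * h)

def divergence(field, point):
    """Cartesian divergence: sum of d(field_i)/d(x_i)."""
    return sum(partial(lambda *r, i=i: field(*r)[i], point, i) for i in range(3))

def gradient(f, point):
    """Cartesian gradient of a scalar function."""
    return tuple(partial(f, point, i) for i in range(3))

def flux_density(x, y, z):
    """The product rho * v."""
    d = rho(x, y, z)
    return tuple(d * c for c in vel(x, y, z))

point = (0.5, -1.0, 2.0)
lhs = divergence(flux_density, point)
rhs = (rho(*point) * divergence(vel, point)
       + sum(a * b for a, b in zip(vel(*point), gradient(rho, point))))
```

At the chosen point both sides should come to 16.625, up to the finite-difference error.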

Generalized volume-integral theorem

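We can rewrite the fourth integral theorem (5L) in the "dot del" notation as

$$\oiint_S \hat{\mathbf{n}}\cdot\nabla\psi \; dS \,=\, \iiint_V \triangle\psi \; dV \,. \tag{18L}$$

Then, using notations (14), we can condense all four integral theorems (5g), (5c), (5d), and (18L) into the single equation

$$\oiint_S \hat{\mathbf{n}}\circ\psi \; dS \,=\, \iiint_V \nabla\circ\psi \; dV \,, \tag{19}$$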

where the "circ" symbol ∘ is a generic binary operator which may be replaced by a null (direct juxtaposition of the operands) for theorem (5g), or a cross for (5c), or a dot for (5d), or ⸱∇ for (18L). This single equation is a generalized volume-integral theorem, relating an integral over a volume to an integral over its enclosing surface.[11]

Theorem (19) is based on the following definitions, which have been found unambiguous:

  • the gradient of a scalar field p is the closed-surface integral of  n̂ p per unit volume, where n̂ is the outward unit normal;
  • the divergence of a vector field is the outward flux integral per unit volume;
  • the curl of a vector field is the skew surface integral per unit volume, also called the surface circulation per unit volume; and
  • the Laplacian is the closed-surface integral of the outward normal derivative, per unit volume.

The gradient maps a scalar field to a vector field; the divergence maps a vector field to a scalar field; the curl maps a vector field to a vector field; and the Laplacian maps a scalar field to a scalar field, or a vector field to a vector field, etc.

The gradient of p, as defined above, has been shown to be also

  • the vector whose (scalar) component in any direction is the directional derivative of p in that direction (i.e. the derivative of p w.r.t. distance in that direction), and
  • the vector whose direction is that in which the directional derivative of p is a maximum, and whose magnitude is that maximum.

Consistent with these alternative definitions of the gradient, we have defined the ⸱∇ operator so that  ŝ⸱∇ (for a unit vector ŝ) is the operator yielding the directional derivative in the direction of  ŝ , and we have used that notation to bring theorem (5L) under theorem (19).

So far, we have said comparatively little about the curl. That imbalance will now be rectified.

Closed-circuit integrals per unit area

Instant integral theorems (on a condition)

Theorems (5g) to (5L) are three-dimensional: each of them relates an integral over a volume V  to an integral over its enclosing surface S. We now seek analogous two-dimensional theorems, each of which relates an integral over a surface segment to an integral around its enclosing curve. For maximum generality, the surface segment should be allowed to be curved into a third dimension.[l] Theorems of this kind can be obtained as special cases of theorems (5g) to (5L) by suitably choosing V and S ; this is another advantage of our "volume first" approach.

Let Σ be a surface segment enclosed by a curve C (a circuit or closed contour), and let l be a parameter measuring arc length around C , so that a general element of C has length dl ; and let a general element of the surface Σ have area dΣ. Let σ̂ be the unit normal vector at a general point on Σ , and let t̂ be the unit tangent vector to C at a general point on C in the direction of increasing l. In the original case of a surface enclosing a volume, we had to decide whether the unit normal pointed into or out of the volume (we chose the latter). In the present case of a circuit enclosing a surface segment, we have to decide whether l is measured clockwise or counterclockwise as seen when looking in the direction of the unit normal, and we choose clockwise. So l is measured clockwise about σ̂ and C is traversed clockwise about σ̂.

From Σ  we can construct obvious candidates for V and S. From every point on Σ , erect a perpendicular with a uniform small height h in the direction of σ̂. Then simply let V be the volume occupied by all the perpendiculars, and let S be its enclosing surface. Thus V is a (generally curved) thin slab of uniform thickness h, whose enclosing surface S consists of two close parallel (generally curved) broad faces connected by a perpendicular edge-face of uniform height h ; and we can treat σ̂ as a vector field by extrapolating it perpendicularly from Σ. If we can arrange for h to cancel out, the volume V will serve as a 3D representation of the surface segment Σ while the edge-face will serve as a 2D representation of the curve C , so that our four theorems will relate an integral around C to an integral over Σ, provided that there is no contribution from the broad faces to the integral over S. For brevity, let us call this proviso the 2D condition.

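If  the 2D condition is satisfied, an integral over the new S reduces to an integral over the edge-face, on which

$$dS \,=\, h\,dl \,,$$

so that the cancellation of h will leave an integral over C  w.r.t. length. Meanwhile, in an integral over the new V, regardless of the 2D condition, we have

$$dV \,=\, h\,d\Sigma \,,$$

so that the cancellation of h will leave an integral over Σ w.r.t. area. So, substituting for dS and dV  in (5g) to (5L), and canceling h as planned, we obtain respectively

$$\oint_C \hat{\mathbf{n}}\,p \; dl \,=\, \iint_\Sigma \nabla p \; d\Sigma \,, \tag{20g}$$

$$\oint_C \hat{\mathbf{n}}\times\mathbf{q} \; dl \,=\, \iint_\Sigma \operatorname{curl}\mathbf{q} \; d\Sigma \,, \tag{20c}$$

$$\oint_C \hat{\mathbf{n}}\cdot\mathbf{q} \; dl \,=\, \iint_\Sigma \operatorname{div}\mathbf{q} \; d\Sigma \,, \tag{20d}$$

$$\oint_C \partial_n\psi \; dl \,=\, \iint_\Sigma \triangle\psi \; d\Sigma \,, \tag{20L}$$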

all subject to the 2D condition. In each equation, the circle on the left integral sign acknowledges that the integral is around a closed loop. The unit vector n̂ , which was normal to the edge-face, is now normal to both t̂ and σ̂; that is, n̂ is tangential to the surface segment Σ and projects perpendicularly outward from its bounding curve.

On the left side of (20g), the 2D condition is satisfied if (but not only if) n̂p takes equal-and-opposite values at any two opposing points on opposing broad faces of S , i.e. if p takes the same value at such points, i.e. if p has a zero directional derivative normal to Σ , i.e. if ∇p has no component normal to Σ. Thus a sufficient "2D condition" for (20g) is the obvious one.

Skipping forward to (20L), we see that the 2D condition is satisfied if ∂ₙψ takes equal-and-opposite values at any two opposing points on opposing broad faces of S , i.e. if ∂ψ/∂σ (where σ measures distance in the direction of σ̂) takes the same value at such points, i.e. if ∂²ψ/∂σ² = 0.

For (20c) and (20d), the 2D condition can be satisfied by construction, with more useful results, as explained under the next two headings. To facilitate this process, we first make a minor adjustment to Σ and C. Noting that any curved surface segment can be approximated to any desired accuracy by a polyhedral surface enclosed by a polygon, we shall indeed consider Σ to be a polyhedral surface made up of small planar elements, dΣ being the area of a general element, and we shall indeed consider C to be a polygon with short sides, dl being the length of a general side.[m] The benefit of this trick, as we shall see, is to make the unit normal σ̂ uniform over each surface element, without forcing us to treat q (or any other field) as uniform over the same element. But, as the elements of C can independently be made as short as we like (dividing straight sides into shorter elements if necessary!), we can still consider q , σ̂ , and t̂ to be uniform over each element of C.

Special case for the gradient

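In (20c), the 2D condition is satisfied by q = σ̂ p (where p is a scalar field), because then the integrand on the left is zero on the broad faces of S , where n̂ is parallel to σ̂. Equation (20c) then becomes

$$\oint_C \hat{\mathbf{n}}\times\hat{\boldsymbol{\sigma}}\,p \; dl \,=\, \iint_\Sigma \operatorname{curl}(\hat{\boldsymbol{\sigma}}\,p) \; d\Sigma \,. \tag{21n}$$

Now on the left, n̂ × σ̂ = −t̂ ; and on the right, over each surface element, the unit normal is uniform so that, by (8p), curl (σ̂ p) = ∇p × σ̂ = −σ̂ × ∇p.  With these substitutions, the minus signs cancel and we get

$$\oint_C \hat{\mathbf{t}}\,p \; dl \,=\, \iint_\Sigma \hat{\boldsymbol{\sigma}}\times\nabla p \; d\Sigma \,, \tag{21g}$$

or, if we write dr for t̂ dl and dΣ for σ̂ dΣ ,

$$\oint_C p \; d\mathbf{r} \,=\, \iint_\Sigma d\boldsymbol{\Sigma}\times\nabla p \,. \tag{21r}$$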

This result, although well attested in the literature,[12] does not seem to have a name—unlike the next result.

Special case for the curl

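In (20d), the 2D condition is satisfied if q is replaced by σ̂ × q , because then (again) the integrand on the left is zero on the broad faces of S , where n̂ is parallel to σ̂. Equation (20d) then becomes

$$\oint_C \hat{\mathbf{n}}\cdot\hat{\boldsymbol{\sigma}}\times\mathbf{q} \; dl \,=\, \iint_\Sigma \operatorname{div}(\hat{\boldsymbol{\sigma}}\times\mathbf{q}) \; d\Sigma \,. \tag{22n}$$

Now on the left, the integrand can be written n̂ × σ̂ ⸱ q = −t̂ ⸱ q ; and on the right, div (σ̂ × q) = −σ̂ ⸱ curl q by identity (8c), since σ̂ is uniform over each surface element.  With these substitutions, the minus signs cancel and we get

$$\oint_C \mathbf{q}\cdot\hat{\mathbf{t}} \; dl \,=\, \iint_\Sigma \operatorname{curl}\mathbf{q}\cdot\hat{\boldsymbol{\sigma}} \; d\Sigma \,, \tag{22c}$$

or, if we again write dr for t̂ dl and dΣ for σ̂ dΣ ,

$$\oint_C \mathbf{q}\cdot d\mathbf{r} \,=\, \iint_\Sigma \operatorname{curl}\mathbf{q}\cdot d\boldsymbol{\Sigma} \,. \tag{22r}$$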

This result—the best-known theorem relating an integral over a surface segment to an integral around its enclosing curve, and the best-known theorem involving the curl—is called Stokes' theorem or, more properly, the Kelvin–Stokes theorem,[13] or simply the curl theorem.[n]

The integral on the left of (22c) or (22r) is called the circulation of the vector field q around the closed curve C. So, in words, the Kelvin–Stokes theorem says that the circulation of a vector field around a closed curve is equal to the flux of the curl of that vector field through any surface spanning that closed curve.
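The verbal statement of the Kelvin–Stokes theorem can be tested on a simple case. The sketch below makes several assumptions for illustration: the surface is the flat unit disk in the xy plane with its normal along k, the field is q = (−y³, x³, 0), and both integrals are approximated by midpoint sums. The circuit is traversed counterclockwise as seen from the +z side, which is clockwise when looking in the direction of the normal, in keeping with the convention adopted above.

```python
import math

def q(x, y):
    """Example planar vector field (assumed): q = (-y**3, x**3), with no z component."""
    return (-y ** 3, x ** 3)

def curl_z(x, y, h=1e-5):
    """Component of curl q along k, by central differences: d(q_y)/dx - d(q_x)/dy."""
    dqy_dx = (q(x + h, y)[1] - q(x - h, y)[1]) / (2.0 * h)
    dqx_dy = (q(x, y + h)[0] - q(x, y - h)[0]) / (2.0 * h)
    return dqy_dx - dqx_dy

def circulation(n=2000):
    """Midpoint sum of q . t_hat dl around the unit circle."""
    total = 0.0
    dl = 2.0 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dl
        x, y = math.cos(theta), math.sin(theta)
        t_hat = (-y, x)                      # unit tangent, counterclockwise seen from +z
        qx, qy = q(x, y)
        total += (qx * t_hat[0] + qy * t_hat[1]) * dl
    return total

def curl_flux(nr=200, ntheta=200):
    """Midpoint sum of (curl q) . k over the unit disk, in polar coordinates."""
    total = 0.0
    dr, dtheta = 1.0 / nr, 2.0 * math.pi / ntheta
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(ntheta):
            theta = (j + 0.5) * dtheta
            total += curl_z(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total

circ = circulation()
flux = curl_flux()
```

For this field both integrals have the analytic value 3π/2, so `circ` and `flux` should agree to within the quadrature error.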

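Now let a general element of Σ (with area dΣ ) be enclosed by the curve δC, traversed in the same direction as the outer curve C. Then, applying (22c) to the single element, we have

$$\oint_{\delta C} \mathbf{q}\cdot\hat{\mathbf{t}} \; dl \,=\, \operatorname{curl}\mathbf{q}\cdot\hat{\boldsymbol{\sigma}} \; d\Sigma \,;$$

that is,

$$\hat{\boldsymbol{\sigma}}\cdot\operatorname{curl}\mathbf{q} \,=\, \frac{1}{d\Sigma}\oint_{\delta C} \mathbf{q}\cdot\hat{\mathbf{t}} \; dl \,, \tag{23c}$$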

where the right-hand side is simply the circulation per unit area.

Equation (23c) is an alternative definition of the curl: it says that the curl of q is the vector whose scalar component in any direction is the circulation of q per unit area of a surface whose normal points in that direction. For real q, this component has its maximum, namely |curl q| , in the direction of curl q; thus the curl of q is the vector whose direction is that which a surface must face if the circulation of q per unit area of that surface is to be a maximum, and whose magnitude is that maximum. This is the usual conceptual definition of the curl.[14]

[Notice, however, that our original volume-based definition (4c) is more succinct: the curl is the closed-surface circulation per unit volume, i.e. the skew surface integral per unit volume.]

It should now be clear where the curl gets its name (coined by Maxwell), and why it is also called the rotation (indeed the curl operator is sometimes written "rot", especially in Continental languages, in which "rot" does not have the same unfortunate everyday meaning as in English). It should be similarly unsurprising that a vector field with zero curl is described as irrotational (which one must carefully pronounce differently from "irrational"!), and that the curl of the velocity of a fluid is called the vorticity.

However, a field does not need to be vortex-like in order to have a non-zero curl; for example, by identity (8p), in Cartesian coordinates, the velocity field xj has a curl equal to  ∇x × j = i × j = k ,  although it describes a shearing motion rather than a rotating motion. This is understandable because if you hold a pencil between the palms of your hands and slide one palm over the other (a shearing motion), the pencil rotates. Conversely, we can have a vortex-like field whose curl is zero everywhere except on or near the axis of the vortex. For example, the Maxwell–Ampère law in magnetostatics says that  curl H = J , where H is the magnetizing field and J is the current density.[o] So if the current is confined to a wire, curl H is zero outside the wire, although, as is well known, the field lines circle the wire. The resolution of the paradox is that H gets stronger as we approach the wire, making a shearing pattern, whose effect on the curl counteracts that of the rotation.
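Both examples can be explored with the "circulation per unit area" reading of the curl. The sketch below is illustrative only: the square circuit, the midpoint sums, and the idealized line-vortex field (−y, x)/(x² + y²), standing in for the field around a thin wire, are assumptions of the example. It evaluates the circulation per unit area of the shear field xj, and of the line vortex both away from the axis and around it.

```python
import math

def circulation_per_area(field, centre, a=1e-2, n=200):
    """Circulation of a planar field around a square of side a centred at `centre`
    (counterclockwise seen from +z, using midpoint sums), divided by the area a*a."""
    cx, cy = centre
    half = a / 2.0
    corners = [(cx - half, cy - half), (cx + half, cy - half),
               (cx + half, cy + half), (cx - half, cy + half)]
    total = 0.0
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for i in range(n):
            x, y = x0 + (i + 0.5) * dx, y0 + (i + 0.5) * dy
            fx, fy = field(x, y)
            total += fx * dx + fy * dy          # field . dr on this short element
    return total / (a * a)

def shear(x, y):
    """The shear field x j from the text."""
    return (0.0, x)

def line_vortex(x, y):
    """Idealized vortex (assumed): circles the z axis with magnitude 1/r."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

shear_curl = circulation_per_area(shear, (0.3, -0.7))          # expect 1 (the k component)
vortex_curl = circulation_per_area(line_vortex, (2.0, 1.0))    # expect 0 away from the axis
around_axis = circulation_per_area(line_vortex, (0.0, 0.0), a=2.0) * 4.0  # total circulation
```

The shear field gives a circulation per unit area of 1 wherever the square is placed, whereas the vortex gives zero on a small square away from the axis but a total circulation of 2π around any circuit that encloses the axis.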

The curl-grad and div-curl operators

We have seen from (9L) that the Laplacian of a scalar field is the divergence of the gradient. Four more such second-order combinations make sense, namely the curl of the gradient (of a scalar field), and the divergence of the curl, the gradient of the divergence, and the curl of the curl (of a vector field). The first two —"curl grad" and "div curl"— can now be disposed of.

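Let the surface segment Σ enclosed by the curve C be a segment of the closed surface S surrounding the volume V, and let Σ expand across S until it engulfs S , so that C shrinks to a point on the far side of S. Then, in the nameless theorem (21g) and the Kelvin–Stokes theorem (22c), the integral on the left becomes zero while Σ and σ̂ on the right become S and n̂ , so that the theorems respectively reduce to

$$\oiint_S \hat{\mathbf{n}}\times\nabla p \; dS \,=\, \mathbf{0}$$

and

$$\oiint_S \hat{\mathbf{n}}\cdot\operatorname{curl}\mathbf{q} \; dS \,=\, 0 \,.$$

Applying theorem (5c) to the first of these two equations, and the divergence theorem (5d) to the second, we obtain respectively

$$\iiint_V \operatorname{curl}\nabla p \; dV \,=\, \mathbf{0}$$

and

$$\iiint_V \operatorname{div}\operatorname{curl}\mathbf{q} \; dV \,=\, 0 \,.$$

As the integrals vanish for any volume V in which the integrands are defined, the integrands must be zero wherever they are defined; that is,

$$\operatorname{curl}\nabla p \,=\, \mathbf{0} \tag{24c}$$

and

$$\operatorname{div}\operatorname{curl}\mathbf{q} \,=\, 0 \,. \tag{24d}$$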

In words, the curl of the gradient is zero, and the divergence of the curl is zero; or, more concisely, any gradient is irrotational, and any curl is solenoidal.

We might well ask whether the converses are true. Is every irrotational vector field the gradient of something? And is every solenoidal vector field the curl of something? The answers are affirmative, but the proofs require more preparation.

Meanwhile we may note, as a mnemonic aid, that when the left-hand sides of the last two equations are rewritten in the del-cross and del-dot notations, they become  ∇ × ∇p  and  ∇ ⸱ ∇ × q , respectively. The former looks like (but isn't) a cross-product of two parallel vectors, and the latter looks like (but isn't) a scalar triple product with a repeated factor, so that each expression looks like it ought to be zero (and it is). But such appearances can lead one astray, because ∇ is an operator, not a self-contained vector quantity; for example,  ∇p × ∇φ  is not identically zero, because two gradients are not necessarily parallel.[15]

We should also note, to tie a loose end, that identity (24d) was to be expected from our verbal statement of the Kelvin–Stokes theorem (22c). That statement implies that the flux of the curl through any two surfaces spanning the same closed curve is the same. So if we make a closed surface from two spanning surfaces, the flux into one spanning surface is equal to the flux out of the other, i.e. the net flux out of the closed surface is zero, i.e. the integral of the divergence over the enclosed volume is zero; and since any simple volume in which the divergence is defined can be enclosed this way, the divergence itself (of the curl) must be zero wherever it is defined.

Change per unit length

Continuing (and concluding) the trend of reducing the number of dimensions, we now seek one-dimensional theorems, each of which relates an integral over a path to values at the endpoints of the path. For maximum generality, the path should be allowed to be curved into a second and a third dimension.

We could do this by further specializing theorems (5g) to (5L). We could take a curve Γ with a unit tangent vector ŝ. At every point on Γ we could mount a circular disk with a uniform small area α , centered on Γ and orthogonal to it. We could let V be the volume occupied by all the disks and let S be its enclosing surface; thus V would be a thin right circular cylinder, except that its axis could be curved. If we could arrange for α to cancel out, our four theorems would indeed be reduced to the desired form, provided that there were no contribution from the curved face of the "cylinder" to the integral over S (the "1D proviso"). But, as it turns out, this exercise yields only one case in which the "1D proviso" can be satisfied by a construction involving ŝ and a general field, and we have already almost discovered that case by a simpler and more conventional argument—which we shall now continue.

Fundamental theorem

Equation (9g) is applicable where p(r) is a scalar field,  s is a parameter measuring arc length along a curve Γ, and ŝ is the unit tangent vector to Γ in the direction of increasing s. Let s take the values s1 and s2 at the endpoints of Γ, where the position vector r takes the values r1 and r2 respectively. Then, integrating (9g) w.r.t. s from s1 to s2 and applying the fundamental theorem of calculus, we get

$$\int_{s_1}^{s_2} \hat{\mathbf{s}}\cdot\nabla p \; ds \,=\, p(\mathbf{r}_2) - p(\mathbf{r}_1) \,. \tag{25g}$$

This is our third integral theorem involving the gradient, and the best-known of the three: it is commonly called simply the gradient theorem,[p] or the fundamental theorem of the gradient, or the fundamental theorem of line integrals; it generalizes the fundamental theorem of calculus to a curved path.[16] If we write dr for  ŝ ds (the change in the position vector), we get the theorem in the alternative form

$$\int_{\mathbf{r}_1}^{\mathbf{r}_2} \nabla p\cdot d\mathbf{r} \,=\, p(\mathbf{r}_2) - p(\mathbf{r}_1) \,. \tag{25r}$$

As the right-hand side of (25g) or (25r) obviously depends on the endpoints but not on the path in between, so does the integral on the left. This integral is commonly called the work integral of ∇p over the path, because if ∇p is a force, the integral is the work done by the force over the path. So, in words, the gradient theorem says that the change in value of a scalar field from one point to another is the work integral of the gradient of that field over any path from the one to the other.
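The path-independence asserted by the gradient theorem is easy to check numerically. In this sketch, the scalar field p = xy + sin z, the endpoints, and the two parametrized paths are assumptions invented for the example, and the work integral is approximated by a midpoint sum over short chords.

```python
import math

def p(x, y, z):
    """Example scalar field (assumed)."""
    return x * y + math.sin(z)

def grad_p(x, y, z):
    """Analytic gradient of the example field."""
    return (y, x, math.cos(z))

def work_integral(field, path, n=2000):
    """Midpoint-rule sum of field . dr along path(u) for 0 <= u <= 1."""
    total = 0.0
    for i in range(n):
        start, end = path(i / n), path((i + 1) / n)
        f = field(*path((i + 0.5) / n))
        total += sum(fc * (ec - sc) for fc, sc, ec in zip(f, start, end))
    return total

r1, r2 = (0.0, 0.0, 0.0), (1.0, 2.0, 3.0)

def straight(u):
    """Straight line from r1 to r2."""
    return tuple(a + u * (b - a) for a, b in zip(r1, r2))

def curved(u):
    """A different path with the same endpoints, bulging away from the straight line."""
    bulge = math.sin(math.pi * u)
    return (u + bulge, 2.0 * u - 0.5 * bulge, 3.0 * u ** 2)

change = p(*r2) - p(*r1)
work_straight = work_integral(grad_p, straight)
work_curved = work_integral(grad_p, curved)
```

Both `work_straight` and `work_curved` should approximate `change`, the difference between the endpoint values of p.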

Applying (25r) to a single element of the curve, we get

$$\nabla p\cdot d\mathbf{r} \,=\, dp \,. \tag{26g}$$

Alternatively, we could have obtained (26g) by multiplying both sides of (9g) by ds, and then obtained (25r) by adding (26g) over all the elemental displacements on any path from r1 to r2.

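If we close the path by setting  r2 = r1 , the gradient theorem reduces to

$$\oint \nabla p\cdot d\mathbf{r} \,=\, 0 \,, \tag{27g}$$

where the integral is around any closed loop. Applying the Kelvin–Stokes theorem then gives

$$\iint_\Sigma \operatorname{curl}\nabla p\cdot d\boldsymbol{\Sigma} \,=\, 0 \,, \tag{28g}$$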

where Σ is any surface spanning the loop. As this applies to any loop spanned by any surface on which the integrand is defined,  curl ∇p  must be zero wherever it is defined. This is a second proof (and indeed the usual method of proof) of theorem (24c).

Scalar potential: field with given gradient

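Lemma:  If  curl q = 0  in a simply connected region V,  then ∫ q ⸱ dr over any path in V  depends only on the endpoints of the path.

Proof:  Suppose, on the contrary, that there are two paths Γ and Λ in V,  with a common starting point and a common finishing point, such that

$$\int_\Gamma \mathbf{q}\cdot d\mathbf{r} \,\ne\, \int_\Lambda \mathbf{q}\cdot d\mathbf{r} \,.$$

Let  −Λ denote Λ traversed backwards. Then every dr on Λ is −dr on  −Λ , so that we have

$$\int_\Gamma \mathbf{q}\cdot d\mathbf{r} \,\ne\, -\int_{-\Lambda} \mathbf{q}\cdot d\mathbf{r} \,,$$

i.e.

$$\int_\Gamma \mathbf{q}\cdot d\mathbf{r} \,+\, \int_{-\Lambda} \mathbf{q}\cdot d\mathbf{r} \,\ne\, 0 \,,$$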

where the left-hand side is now a work integral of q around a closed loop in V.  By the simple connectedness of V,  this loop is spanned by some surface Σ in V. So, applying the Kelvin–Stokes theorem to the above equation, we conclude that the flux integral of  curl q  through Σ  is non-zero, in which case  curl q  must be non-zero somewhere on Σ , hence somewhere in V — contradicting the hypothesis of the lemma. ◼

Corollary:  If  curl q = 0  in a simply connected region V,  there exists a scalar field p such that  q = ∇p  in V.

Proof:  We shall show that a suitable candidate is

$$p(\mathbf{r}) \,=\, \int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{q}(\boldsymbol{\rho})\cdot d\boldsymbol{\rho} \,,$$

where r0 is the position vector of any fixed point in V,  and ρ is the position vector of a general point on the path of integration, which may be any path in V. First note that p(r) is unambiguous because, by the preceding lemma, it is independent of the path for given r0 and r, provided that the path is in V.  Now to find  ∇p(r),  let σ be the arc length along the path from r0 to ρ, so that σ ranges from 0 to (say) s  as ρ ranges from r0 to r; and let ŝ be the unit vector tangential to the path at ρ, in the direction of increasing σ.  Then  dρ = ŝ dσ , so that the above equation becomes

$$p(\mathbf{r}) \,=\, \int_0^s \mathbf{q}\cdot\hat{\mathbf{s}} \; d\sigma \,.$$

Differentiating w.r.t. s gives

$$\partial_s p \,=\, \mathbf{q}\cdot\hat{\mathbf{s}} \,,$$

where ŝ is evaluated at  σ = s  and is therefore in the direction in which the path reaches r.  By the generality of the path, this can be any direction. So the last equation says that q is the vector whose (scalar) component in any direction is the derivative of p w.r.t. arc length in that direction; that is, q = ∇p , as required. ◼

This is the promised converse of theorem (24c). But, given an irrotational vector field q , we usually prefer to find a scalar field whose negative gradient is q;  that is, we usually prefer a scalar field φ such that  q = −∇φ.  Such a field is called a scalar potential for q.  From the above expression for p(r), a suitable candidate is

$$\varphi(\mathbf{r}) \,=\, -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{q}(\boldsymbol{\rho})\cdot d\boldsymbol{\rho} \,. \tag{29}$$

A scalar field has zero gradient if and only if it is uniform, so that adding a uniform field, but only a uniform field, to a given scalar field leaves its gradient unchanged. Thus the scalar potential is determined up to an arbitrary additive uniform field. This would be the case with or without the minus sign in front of the gradient. The reason for preferring the minus sign appears next.

Conservative fields

An irrotational vector field—or, equivalently, a field that is (plus or minus) the gradient of something—is described as conservative, because if the field is a force, it does zero work around a closed loop, and consequently conserves energy around the loop (at least if the field does not change during traversal of the loop).

If the only force acting on a particle is  F = −∇U,  then, by the gradient theorem, the work done on the particle over a path is the increase in −U,  i.e. the decrease in U ; and this work is the increase in the particle's kinetic energy T.  Hence, if we identify U with the potential energy, the total energy  U + T  is conserved. This interpretation of the scalar potential is possible only if the force is minus the gradient of the potential.

The minus sign is also used if the conservative vector field is an electric field (force per unit charge) or a gravitational acceleration (force per unit mass); the scalar potential is potential energy per unit charge, or potential energy per unit mass, respectively.

Some special fields

The 1/r scalar potential

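For the potential energy field

$$U \,=\, \frac{1}{r} \,, \tag{30}$$

where r is the distance from the origin (and r ≠ 0), let us find the corresponding force  F = −∇U.  The direction of  ∇U  is that of the steepest increase of U, which, by the spherical symmetry, can only be parallel or anti-parallel to r̂ (the unit vector pointing away from the origin). So

$$\nabla U \,=\, \hat{\mathbf{r}}\,\frac{dU}{dr} \,=\, -\frac{\hat{\mathbf{r}}}{r^2} \,,$$

whence

$$\mathbf{F} \,=\, \frac{\hat{\mathbf{r}}}{r^2} \,. \tag{31}$$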

So the negative gradient of the 1/r  scalar potential (30) is the unit inverse-square radial vector field. Multiplying the numerator and denominator by r gives the alternative form

$$\mathbf{F} \,=\, \frac{\mathbf{r}}{r^3} \,,$$

which is convenient if the center of the force is shifted from the origin to position r′: in that case we simply replace the vector r by r − r′, and the scalar r by |r − r′|, so that the force becomes

$$\mathbf{F} \,=\, \frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}$$

and the corresponding scalar potential becomes

$$U \,=\, \frac{1}{|\mathbf{r}-\mathbf{r}'|} \,.$$

Inverse-square radial vector field

We derived the vector field (31) as the negative gradient of the scalar potential (30). Conversely, given the inverse-square radial vector field (31), we could derive its scalar potential from (29). At a general point on the path, let the position vector be s = s ŝ, so that, by (31), F = ŝ/s². Then (29) becomes

$$U(\mathbf{r}) = -\int_{r_0}^{r} \frac{\hat{\mathbf{s}}}{s^2}\cdot d\mathbf{s} = -\int_{r_0}^{r} \frac{ds}{s^2} = \frac{1}{r} - \frac{1}{r_0},$$

so that, if we choose r0 → ∞, we recover (30).
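The two directions of this correspondence can be checked numerically. The following is only a sketch using the Python standard library: it assumes the potential U = 1/r and the field r/r³, estimates −∇U by central differences, and estimates the radial line integral by the midpoint rule. The sample point, the step sizes, and the finite starting radius r0 are arbitrary illustrative choices.

```python
import math

def U(x, y, z):
    """Potential energy field: U = 1/r."""
    return 1.0 / math.sqrt(x*x + y*y + z*z)

def neg_grad_U(x, y, z, h=1e-5):
    """Central-difference estimate of F = -grad U."""
    return (
        -(U(x + h, y, z) - U(x - h, y, z)) / (2*h),
        -(U(x, y + h, z) - U(x, y - h, z)) / (2*h),
        -(U(x, y, z + h) - U(x, y, z - h)) / (2*h),
    )

def inverse_square(x, y, z):
    """Unit inverse-square radial field, written as r / r^3."""
    r = math.sqrt(x*x + y*y + z*z)
    return (x / r**3, y / r**3, z / r**3)

def potential_from_field(r, r0, n=100000):
    """Midpoint-rule estimate of minus the integral of ds/s^2 from r0 to r."""
    ds = (r - r0) / n
    total = 0.0
    for k in range(n):
        s = r0 + (k + 0.5) * ds
        total += ds / (s*s)
    return -total

point = (0.3, -1.2, 2.0)          # arbitrary point away from the origin
numeric = neg_grad_U(*point)
exact = inverse_square(*point)
grad_error = max(abs(a - b) for a, b in zip(numeric, exact))

r, r0 = 2.0, 100.0                # finite r0 standing in for "far away"
potential_error = abs(potential_from_field(r, r0) - (1.0/r - 1.0/r0))
```

Both errors should be small compared with the field values themselves; as r0 is increased, the integral tends to 1/r.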

Because F, given by (31), has a scalar potential, curl F must be zero. This is independently obvious in that the spherical symmetry of F seems to rule out any resemblance of rotation or shear, even at the origin, where F becomes infinite. On the last point, let us check whether curl F has a meaningful integral over a volume containing the origin. If the volume V is enclosed by the surface S whose outward unit normal is n̂, then, by theorem (5c),

$$\iiint_V \operatorname{curl}\mathbf{F}\;dV = \oiint_S \hat{\mathbf{n}}\times\mathbf{F}\;dS.$$

If V contains the origin, then, because  curl F  is zero everywhere except at the origin, the volume V  can be replaced by any element of V  containing the origin, whatever the shape of that element may be. If we choose that element to be a spherical ball centered on the origin, then n̂ is parallel to r̂ , so that the cross-product in the integrand on the right is zero. Thus the volume integral on the left is not only meaningful, but is zero, even if the volume contains the point where the integrand is undefined. In this sense, the field F is so irrotational that its curl may be taken as zero even where the field itself is undefined!

The situation concerning the divergence of F is more complicated. Again, let the volume V be enclosed by the surface S whose outward unit normal is n̂. By the divergence theorem (5d),

$$\iiint_V \operatorname{div}\mathbf{F}\;dV = \oiint_S \mathbf{F}\cdot\hat{\mathbf{n}}\;dS = \oiint_S \frac{\hat{\mathbf{r}}\cdot\hat{\mathbf{n}}}{r^2}\,dS = \oiint_S d\Omega,$$

where dΩ is the solid angle subtended at the origin by the surface element of area dS, and is positive if the outward unit normal n̂ has a positive component away from the origin (r̂ ⸱ n̂ > 0), and negative if n̂ has a positive component toward the origin (r̂ ⸱ n̂ < 0). If the volume enclosed by S does not include the origin, then for every positive contribution dΩ there is a compensating negative contribution, so that the integral of div F over the volume is zero. As this applies to every such volume, div F must be zero everywhere except at the origin. If, on the contrary, the volume does include the origin, then the contributions dΩ add up to the total solid angle subtended by the enclosing surface, which is 4π. In summary,

$$\operatorname{div}\frac{\hat{\mathbf{r}}}{r^2} = 4\pi\,\delta(\mathbf{r}), \tag{32d}$$

where δ(r), the 3D unit delta function, is zero everywhere except at the origin, but has an integral of 1 over any volume that includes the origin. For example, a unit point-mass at the origin has the density δ(r), and a point-mass m at position r′ has the density m δ(r − r′). As the argument of div in (32d) is −∇(1/r), we also have

$$\nabla^2\,\frac{1}{r} = -4\pi\,\delta(\mathbf{r}). \tag{32L}$$

If we shift the centers from the origin to r′, the last two results become

$$\operatorname{div}\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3} = 4\pi\,\delta(\mathbf{r}-\mathbf{r}') \tag{33d}$$

and

$$\nabla^2\,\frac{1}{|\mathbf{r}-\mathbf{r}'|} = -4\pi\,\delta(\mathbf{r}-\mathbf{r}'). \tag{33L}$$
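The solid-angle argument can be illustrated numerically. The sketch below (standard-library Python, midpoint rule) estimates the outward flux of the field r/r³ through the surface of an axis-aligned cube. The cube, its size, and the grid resolution are illustrative assumptions; any closed surface would serve. The flux should come out close to 4π when the cube encloses the origin and close to zero when it does not.

```python
import math

def field(x, y, z):
    """Unit inverse-square radial field r / r^3."""
    r3 = (x*x + y*y + z*z) ** 1.5
    return (x / r3, y / r3, z / r3)

def flux_through_cube(center, half_side, n=200):
    """Midpoint-rule estimate of the outward flux through an axis-aligned cube."""
    cx, cy, cz = center
    h = 2.0 * half_side / n        # side of each square surface element
    dA = h * h
    total = 0.0
    for i in range(n):
        u = -half_side + (i + 0.5) * h
        for j in range(n):
            w = -half_side + (j + 0.5) * h
            for sign in (-1.0, 1.0):
                d = sign * half_side
                # Faces normal to x, y, z; the outward normal is sign * e_k,
                # so only the k-th component of the field contributes.
                total += sign * field(cx + d, cy + u, cz + w)[0] * dA
                total += sign * field(cx + u, cy + d, cz + w)[1] * dA
                total += sign * field(cx + u, cy + w, cz + d)[2] * dA
    return total

flux_enclosing = flux_through_cube((0.0, 0.0, 0.0), 1.0)   # origin inside
flux_excluding = flux_through_cube((3.0, 0.0, 0.0), 1.0)   # origin outside
```

The first result depends only on whether the origin is enclosed, not on the shape or size of the surface, which is the content of (32d).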

Field with given divergence (and zero curl)

It follows from Coulomb's law that the electric field due to a point-charge Q at the origin, in a vacuum, is

$$\mathbf{E} = \frac{Q}{4\pi\epsilon_0}\,\frac{\hat{\mathbf{r}}}{r^2},$$

where ϵ0 is a physical constant (called the vacuum permittivity or simply the electric constant). In a vacuum, the electric displacement field, denoted by D, is ϵ0E. So it is convenient to multiply the above equation by ϵ0, obtaining

$$\mathbf{D} = \frac{Q}{4\pi}\,\frac{\hat{\mathbf{r}}}{r^2}.$$

This is an inverse-square radial vector field and therefore has zero curl.

Now suppose that, instead of a charge Q at the origin, we have a static charge density ρ(r′) in a general elemental volume dV′ at position r′ (the standard symbol for charge density being unfortunately the same as for mass density). Then the contribution from that element to the field D at position r is

$$d\mathbf{D} = \frac{\rho(\mathbf{r}')\,dV'}{4\pi}\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3},$$

provided that, for each r, the dimensions of each volume element are small compared with |r − r′|. This contribution likewise has zero curl. The total field due to static charges is then the sum of the contributions:

$$\mathbf{D}(\mathbf{r}) = \iiint \frac{\rho(\mathbf{r}')}{4\pi}\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dV', \tag{34}$$

where the integral is over all space. And D(r) has zero curl because all the contributions have zero curl.

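Independently of the physical significance of D(r), we can take its divergence "term by term" (or "under the integral sign"), obtaining

$$\operatorname{div}\mathbf{D} = \iiint \frac{\rho(\mathbf{r}')}{4\pi}\,\operatorname{div}\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dV' = \iiint \rho(\mathbf{r}')\,\delta(\mathbf{r}-\mathbf{r}')\,dV' = \rho(\mathbf{r})\iiint \delta(\mathbf{r}-\mathbf{r}')\,dV' = \rho(\mathbf{r})\iiint \delta(\mathbf{r}'-\mathbf{r})\,dV',$$

where the last step is permitted because the volume integral of the delta function of r′ is not changed by a "point reflection" (inversion) across r. As the volume of integration (all space) includes the shifted origin of the delta function, the integral is simply 1, so that

$$\operatorname{div}\mathbf{D} = \rho, \tag{35}$$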

where both sides are evaluated at r.

Mathematically, this result is an identity which applies if D is given by (34). It shows that we can construct an irrotational vector field D(r) whose divergence is a given scalar field ρ(r). And of course, by theorem (24d), any curl can be added to that vector field without changing its divergence.

In electrostatics, (34) is a generalization of Coulomb's law; and (35), which follows from (34), is Gauss's law expressed in differential form. If we integrate (35) over a volume enclosed by a surface S (with outward unit normal n̂) and apply the divergence theorem on the left, we get the integral form of Gauss's law:

$$\oiint_S \mathbf{D}\cdot\hat{\mathbf{n}}\;dS = Q_e, \tag{36}$$

where Qe is the total charge enclosed by S.

Field with given Laplacian

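As an identity, (35) can be written out in full by substituting from (34):

$$\operatorname{div}\iiint \frac{\rho(\mathbf{r}')}{4\pi}\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,dV' = \rho(\mathbf{r}). \tag{37}$$

Recognizing the r-dependent factor (r − r′)/|r − r′|³ as −∇(1/|r − r′|) and taking the gradient operator outside the integral, we get

$$\operatorname{div}\left(-\nabla\iiint \frac{\rho(\mathbf{r}')}{4\pi\,|\mathbf{r}-\mathbf{r}'|}\,dV'\right) = \rho(\mathbf{r}),$$

i.e.

$$\nabla^2\iiint \frac{-\rho(\mathbf{r}')}{4\pi\,|\mathbf{r}-\mathbf{r}'|}\,dV' = \rho(\mathbf{r}). \tag{38}$$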

This shows that we can construct a field whose Laplacian is a given field. More precisely, it shows that we can construct a scalar field whose Laplacian is a given scalar  field ρ(r). But, due to the linearity of the Laplacian, the same applies to any given linear combination of scalar fields, including any combination whose coefficients are uniform vectors, uniform matrices, or uniform tensors of any order; that is, the same applies to any field that we can express with a uniform basis.

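Mathematically, (38) is simply an identity. To find its significance in electrostatics, we can multiply it by −1/ϵ0, obtaining

$$\nabla^2\iiint \frac{\rho(\mathbf{r}')}{4\pi\epsilon_0\,|\mathbf{r}-\mathbf{r}'|}\,dV' = -\frac{\rho(\mathbf{r})}{\epsilon_0}, \tag{39}$$

which is also an identity. But the negative gradient of the expression after the integral sign is

$$\frac{\rho(\mathbf{r}')\,dV'}{4\pi\epsilon_0}\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3},$$

which is the contribution to the electric field at position r due to a charge ρ(r′) dV′ at position r′ in a vacuum. So the expression after the integral sign is the corresponding contribution to the electrostatic potential, and the whole integral is the whole electrostatic potential. Denoting this by φ, we can rewrite (39) as

$$\nabla^2\varphi = -\frac{\rho}{\epsilon_0}. \tag{40}$$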

This is Poisson's equation in electrostatics, treating the medium as a vacuum (so that ρ must be taken as the total charge density, including any contributions caused by the effect of the field on the medium). In a region in which ρ = 0, Poisson's equation (40) reduces to

$$\nabla^2\varphi = 0, \tag{41}$$

which is Laplace's equation in electrostatics.

The wave equation

It is an empirical fact that a compressible fluid, such as air, carries waves of a mechanical nature: sound waves. In establishing the unambiguity of the gradient and the divergence, we have already derived equations dealing with the inertia and continuity (mass-conservation) of non-viscous fluids. So, by introducing a relation describing the compressibility, and eliminating variables, we should be able to get one equation (the "wave equation") in one scalar or vector field (the "wave function"), with recognizably "wavelike" solutions. And we should expect this equation to be analogous to equations describing other kinds of waves.

If we suppose, for simplicity, that the only force acting on an element of fluid is the pressure force, the applicable equation of motion is (6g). But, for reasons which will soon be apparent, let us call the pressure P, so that (6g) becomes

$$\rho\,\frac{d\mathbf{v}}{dt} = -\nabla P.$$

Then at equilibrium we have

$$\mathbf{0} = -\nabla P_0,$$

where P0 is the equilibrium pressure. Subtracting this equation from the previous one and defining

$$p = P - P_0,$$

we get

$$\rho\,\frac{d\mathbf{v}}{dt} = -\nabla p,$$

which looks like (6g), except that p is now the sound pressure (also called "acoustic pressure", or sometimes "excess pressure"), i.e. the pressure rise above equilibrium.

For the equation of continuity we can use (7d'), which we repeat for convenience:

$$\rho\,\operatorname{div}\mathbf{v} = -\frac{d\rho}{dt}.$$

Eliminating v between the last two equations is fraught because v is evaluated at a moving point in the former and at a fixed point in the latter; and introducing any relation between p and ρ is similarly fraught because p is evaluated at a fixed point and ρ at a moving point. The obvious remedy is to apply the advection rule (16) to the last two equations, obtaining respectively

$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla p, \qquad \rho\,\operatorname{div}\mathbf{v} = -\frac{\partial\rho}{\partial t} - \mathbf{v}\cdot\nabla\rho.$$

That gets all the variables evaluated at fixed points, at the cost of making the equations more complicated and more obviously non-linear. But the equations can be simplified and linearized by small-amplitude approximations. In the parentheses in the first equation, the first term is proportional to the amplitude of the vibrations while the second term is a product of two factors proportional to the amplitude, so that, for sufficiently small amplitudes, the second term is negligible. Similarly, in the second equation, for sufficiently small amplitudes and a homogeneous medium, we can neglect the second term on the right. Then, on the left side of each equation, we are left with a factor proportional to the amplitude, multiplied by ρ. But ρ is not proportional to the amplitude; only its deviation from the equilibrium density is so proportional. Hence, for small amplitudes, ρ can be replaced by the equilibrium density, which we shall call ρ0, which is independent of time and (in a homogeneous medium) independent of position. With these approximations, our equations of motion and continuity become

$$\rho_0\,\dot{\mathbf{v}} = -\nabla p, \qquad \rho_0\,\operatorname{div}\mathbf{v} = -\dot{\rho},$$

where, for brevity, we use an overdot to denote partial differentiation w.r.t. time (i.e., at a fixed point, not a point moving with the fluid).

Now we can eliminate v. Taking divergences in the first equation, and differentiating the second partially w.r.t. time (which can be done inside the div operator, which represents a linear combination), we get

$$\rho_0\,\operatorname{div}\dot{\mathbf{v}} = -\nabla^2 p, \qquad \rho_0\,\operatorname{div}\dot{\mathbf{v}} = -\ddot{\rho},$$

so that we can equate the right-hand sides, obtaining

$$\nabla^2 p = \ddot{\rho}. \tag{42}$$

Maintaining the small-amplitude assumption, we can now consider compressibility. For small compressions in a homogeneous medium, we may suppose that the pressure change dp is some constant times the density change dρ. It is readily verified that such a constant must have the dimension of velocity squared. So we can say dp = c² dρ, where c is a constant with the units of velocity.[q] Dividing by dt gives ṗ = c² ρ̇, whence

$$\ddot{\rho} = \frac{1}{c^2}\,\ddot{p}. \tag{43}$$

Substituting from (42) then gives the desired wave equation:

$$\nabla^2 p = \frac{1}{c^2}\,\frac{\partial^2 p}{\partial t^2}. \tag{44}$$

This is the 3D classical wave equation with the sound pressure p as the wave function. For a generic wave function ψ, in a homogeneous isotropic medium, we would expect the equation to be

$$\nabla^2\psi - \frac{1}{c^2}\,\frac{\partial^2\psi}{\partial t^2} = 0, \tag{45}$$

which may be written more compactly as

$$\Box\,\psi = 0, \tag{46}$$

where ☐, pronounced "wave" or "box",[r] is called the D'Alembertian operator and is defined by

$$\Box = \nabla^2 - \frac{1}{c^2}\,\frac{\partial^2}{\partial t^2} \tag{47}$$

in this paper, although other conventions exist.[s]

In a static situation, the second term on the right is zero. So one advantage of definition (47), over any alternative definition that changes the sign or the scale factor, is that in the static case, the D'Alembertian is reduced to the Laplacian, making it especially obvious that in the static case, the wave equation is reduced to Laplace's equation [compare (46) and (41)]. Also notice that the D'Alembertian, being a linear combination of two linear operators, is itself linear.

Spherical waves

Having established that there are wavelike time-dependent fields described by equation (45), in which the constant c has the units of velocity, we can now make an informed guess at an elementary solution of the equation. Consider the candidate

$$\psi = \frac{1}{r}\,f\!\left(t - \frac{r}{c}\right), \tag{48}$$

where  r = rr̂  is the position vector (so that r is distance from the origin),  f  is an arbitrary function (arbitrary except that we will need to be able to differentiate it twice),  t is time, and c is a constant (and obviously ψ is not defined at the origin even if f  is.)

If, at the origin, the function f has a certain argument at time t = t₀, then at any distance r from the origin, it has the same argument at time t = t₀ + r/c, which is r/c later than at the origin. Hence, if f has a certain feature (e.g., a zero-crossing) at the origin, the time taken for that feature to reach any distance r is r/c, implying that the feature travels outward from the origin at speed c. Another way to perceive this is to set the argument of f equal to a constant (corresponding to some feature of the function) and differentiate w.r.t. t, obtaining dr/dt = c (the speed at which the feature recedes from the origin). Thus equation (48) describes waves radiating outward from the origin with speed c. [t]

Equation (48) further implies that there are surfaces over which the wave function ψ  is uniform—namely surfaces of constant r,  i.e. spheres centered on the origin. These are the wavefronts. So (48) describes spherical waves.

Because the surface area of a sphere is proportional to the square of its radius, we should expect the radiated intensity (power per unit area) to satisfy an inverse-square law (if the medium is lossless—neither absorbing nor scattering the radiated power). That does not mean that the wave function itself should satisfy an inverse-square law. In a traveling wave in 3D space, there will be an "effort" variable (e.g., sound pressure) and a "flow" variable (e.g., fluid velocity), and the instantaneous intensity will be proportional to the product of the two. If the two are proportional to each other, the instantaneous intensity will be proportional to the square of one or the other. Hence if the instantaneous intensity falls off like 1/r 2, the effort and flow variables—and the wave function, if it is proportional to one or the other—will fall off like 1/r. That suggests the attenuation factor 1/r  in (48).

But there are big ifs in that argument. For all we know so far, the relation between effort and flow could involve a lag, so that the instantaneous product of the two could swing negative although it averages to something positive. And for all we know so far, the lag could vary with r, allowing at least one of the two (effort or flow) to depart from the 1/r law, even if their average product still falls off like 1/r². The 1/r factor in (48) is therefore only an "informed guess". Notwithstanding these complications, we have also guessed that the form of the function f (the "waveform") does not change as r increases; we have not considered whether this behavior might depend on the medium, or the functional form, or the geometry.

So let us carefully verify that (48) satisfies (45) or, equivalently, (46).

As a first step, and as a useful inquiry in its own right, we find ∇²ψ from definition (4L), given that ψ is a function of (r, t) only. For the surface δS let us start with

  • a cone (not a double cone) with its apex at the origin, subtending a small solid angle ω at the origin,
  • a sphere centered on the origin, with radius r, and
  • a sphere centered on the origin, with radius r + dr ;

and let the volume element be the region inside the cone and between the spheres, so that its enclosing surface δS has three faces: a segment of the cone, a segment of the inner sphere with area r²ω, and a segment of the outer sphere with area (r + dr)²ω. By the symmetry of ψ, the outward normal derivative ∂nψ is equal to zero on the conical face, +∂rψ(r + dr, t) on the outer spherical face, and −∂rψ(r, t) on the inner spherical face. The volume of the element is dV = r²ω dr. So, assembling the pieces of definition (4L), we get

$$\nabla^2\psi = \frac{(r+dr)^2\,\omega\;\partial_r\psi(r+dr,\,t) \;-\; r^2\,\omega\;\partial_r\psi(r,\,t)}{r^2\,\omega\,dr},$$

i.e.

$$\nabla^2\psi = \frac{1}{r^2}\,\frac{\partial}{\partial r}\!\left(r^2\,\frac{\partial\psi}{\partial r}\right). \tag{49}$$

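Now we can verify our "informed guess". Differentiating (48) twice w.r.t. t by the chain rule gives

$$\frac{\partial^2\psi}{\partial t^2} = \frac{1}{r}\,f''\!\left(t - \frac{r}{c}\right), \tag{50}$$

where each prime (′) denotes differentiation of the function w.r.t. its own argument. Differentiating (48) once w.r.t. r by the product rule and chain rule, we get

$$\frac{\partial\psi}{\partial r} = -\frac{1}{r^2}\,f\!\left(t - \frac{r}{c}\right) - \frac{1}{rc}\,f'\!\left(t - \frac{r}{c}\right). \tag{51}$$

Proceeding as specified in (49), we multiply this by r², differentiate again w.r.t. r (obtaining three terms, of which two cancel), and divide by r², and get

$$\nabla^2\psi = \frac{1}{rc^2}\,f''\!\left(t - \frac{r}{c}\right). \tag{52}$$

Then if we substitute (52) and (50) into (47), we obviously get ☐ψ = 0, satisfying (46). So we have guessed correctly.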
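The same verification can be done numerically, as a sanity check on the algebra. The sketch below (standard-library Python) assumes a Gaussian waveform f and a wave speed c = 2, both chosen purely for illustration. It evaluates the radial Laplacian (1/r²) ∂r(r² ∂rψ) and the second time-derivative by central differences, then forms the D'Alembertian as the Laplacian minus 1/c² times the second time-derivative.

```python
import math

C = 2.0  # wave speed; an arbitrary illustrative value

def f(u):
    """Hypothetical smooth waveform (a Gaussian pulse)."""
    return math.exp(-u * u)

def psi(r, t):
    """Candidate spherical wave: psi = f(t - r/c) / r."""
    return f(t - r / C) / r

def laplacian_radial(r, t, h=1e-3):
    """(1/r^2) d/dr (r^2 dpsi/dr), by nested central differences."""
    def r2_dpsi_dr(s):
        return s * s * (psi(s + h, t) - psi(s - h, t)) / (2 * h)
    return (r2_dpsi_dr(r + h) - r2_dpsi_dr(r - h)) / (2 * h) / (r * r)

def d2_dt2(r, t, h=1e-3):
    """Second partial derivative w.r.t. time at fixed r."""
    return (psi(r, t + h) - 2 * psi(r, t) + psi(r, t - h)) / (h * h)

def dalembertian(r, t):
    """Laplacian minus (1/c^2) times the second time-derivative."""
    return laplacian_radial(r, t) - d2_dt2(r, t) / (C * C)

r, t = 1.5, 1.0                        # arbitrary point away from the origin
residual = dalembertian(r, t)          # should be near zero
scale = abs(laplacian_radial(r, t))    # size of the terms that cancel
```

The residual should be several orders of magnitude smaller than the two terms that cancel, and should shrink further as the step h is reduced (until rounding error dominates).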

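Having shown that the D'Alembertian of ψ, as given by (48), is zero everywhere except at the origin (where it is not defined), let us now find its integral over a volume V (enclosed by a surface S) that includes the origin. From (47),

$$\iiint_V \Box\psi\;dV = \iiint_V \nabla^2\psi\;dV - \frac{1}{c^2}\iiint_V \frac{\partial^2\psi}{\partial t^2}\,dV = \oiint_S \partial_n\psi\;dS - \frac{1}{c^2}\iiint_V \frac{\partial^2\psi}{\partial t^2}\,dV,$$

where the second equality follows from theorem (5L). Now because the integrand on the left is zero except at the origin, any V containing the origin will give the same integral. So for convenience, let V be a spherical ball of radius R centered on the origin. Then, by the spherical symmetry of ψ, integration over S reduces to multiplication by 4πR², and ∂n is equivalent to ∂r, and dV can be taken as 4πr² dr. With these substitutions we have

$$\iiint_V \Box\psi\;dV = 4\pi R^2\,\partial_r\psi\,\Big|_{r=R} - \frac{1}{c^2}\int_0^R \frac{\partial^2\psi}{\partial t^2}\,4\pi r^2\,dr,$$

or, substituting from (51) and (50),

$$\iiint_V \Box\psi\;dV = -4\pi f\!\left(t - \frac{R}{c}\right) - \frac{4\pi R}{c}\,f'\!\left(t - \frac{R}{c}\right) - \frac{4\pi}{c^2}\int_0^R r\,f''\!\left(t - \frac{r}{c}\right)dr.$$

Again noting that any V containing the origin will give the same integral, we can let R approach zero, with the result that the integral approaches −4πf(t). This is the integral of ☐ψ over any volume containing the origin, for ψ given by (48). Meanwhile ☐ψ is zero everywhere except at the origin. In summary,

$$\Box\left[\frac{1}{r}\,f\!\left(t - \frac{r}{c}\right)\right] = -4\pi f(t)\,\delta(\mathbf{r}). \tag{53}$$

Shifting the center of the spherical waves from the origin to position r′, we get

$$\Box\left[\frac{1}{|\mathbf{r}-\mathbf{r}'|}\,f\!\left(t - \frac{|\mathbf{r}-\mathbf{r}'|}{c}\right)\right] = -4\pi f(t)\,\delta(\mathbf{r}-\mathbf{r}'). \tag{54}$$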

We shall refer to the field given by (48) as the wave function due to a monopole source with strength f (t) at the origin. The D'Alembertian of this wave function is given by (53).[17] Hence the field whose D'Alembertian is given by (54) is the wave function due to a monopole source with strength f (t) at position r′. In each case, the D'Alembertian is zero everywhere except at the source.

Field with given D'Alembertian

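Now suppose that, instead of a wave source with strength f(t) at the general position r′, we have at that position a wave-source density ρ(r′, t) in an elemental volume dV′, whose contribution to the wave function ψ at position r is

$$d\psi = \frac{\rho\big(\mathbf{r}',\; t - |\mathbf{r}-\mathbf{r}'|/c\big)}{|\mathbf{r}-\mathbf{r}'|}\,dV',$$

where for each r, the dimensions of each volume element are small compared with |r − r′|. Then the total wave function is the sum of the contributions:

$$\psi(\mathbf{r},t) = \iiint \frac{\rho\big(\mathbf{r}',\; t - |\mathbf{r}-\mathbf{r}'|/c\big)}{|\mathbf{r}-\mathbf{r}'|}\,dV', \tag{55}$$

where the integral is over all space.

Independently of the physical significance of ψ(r, t), we can take its D'Alembertian "under the integral sign" by rule (54), obtaining

$$\Box\,\psi(\mathbf{r},t) = \iiint -4\pi\,\rho(\mathbf{r}',t)\,\delta(\mathbf{r}-\mathbf{r}')\,dV';$$

that is,

$$\Box\,\psi(\mathbf{r},t) = -4\pi\,\rho(\mathbf{r},t). \tag{56}$$

Mathematically, equation (56) is an identity which applies if ψ(r, t) is given by (55). Substituting from (55) and solving for ρ(r, t), we can write the identity in full as

$$\rho(\mathbf{r},t) = -\frac{1}{4\pi}\,\Box\iiint \frac{\rho\big(\mathbf{r}',\; t - |\mathbf{r}-\mathbf{r}'|/c\big)}{|\mathbf{r}-\mathbf{r}'|}\,dV', \tag{57}$$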

which shows that we can construct a wave function with a given D'Alembertian.

Physically, equation (56) gives the D'Alembertian of the wave function for a source density. It is the inhomogeneous wave equation, which applies in the presence of an arbitrary source density—in contrast to the homogeneous wave equation (46), which applies in a region where the source density is zero. In this context the word homogeneous or inhomogeneous describes the equation, not the medium (which has been assumed homogeneous and isotropic).

In a static situation, in which the D'Alembertian is reduced to the Laplacian, the inhomogeneous wave equation (56) is reduced to the form of Poisson's equation (40). As written, equation (40) is Poisson's equation in electrostatics; it applies to the charge density ρ(r), for which the scalar potential [in (39)] is

$$\varphi(\mathbf{r}) = \iiint \frac{\rho(\mathbf{r}')}{4\pi\epsilon_0\,|\mathbf{r}-\mathbf{r}'|}\,dV'.$$

In electrodynamics, which takes time-dependence into account, the scalar potential due to the charge density ρ(r, t) is

$$\varphi(\mathbf{r},t) = \iiint \frac{\rho\big(\mathbf{r}',\; t - |\mathbf{r}-\mathbf{r}'|/c\big)}{4\pi\epsilon_0\,|\mathbf{r}-\mathbf{r}'|}\,dV',$$

where the wave speed c is the speed of light; this is the same as in the static case except for the delay |r − r′|/c, indicating that the influence of the charge density at r′ travels outward from that point at the speed of light. In the dynamic case, by rule (57), the D'Alembertian of the scalar potential is

$$\Box\,\varphi = -\frac{\rho}{\epsilon_0}.$$

This result is the inhomogeneous wave equation in the scalar potential, the equation which, in the electrostatic case, reduces to Poisson's equation (40).

In electrodynamics, however, the electric field E is not simply −∇φ but −∇φ − ∂A/∂t, where A is the magnetic vector potential, whose defining property is that its curl is the magnetic flux density:

$$\mathbf{B} = \operatorname{curl}\mathbf{A}.$$

By identity (24d), this property implies

$$\operatorname{div}\mathbf{B} = 0,$$

which is Gauss's law for magnetism. We have noted in passing (but not yet proven) that (24d) has a converse, whereby the solenoidality of B implies the existence of the vector potential A. Precedents suggest that one way to prove this is to find a vector field whose curl is a delta function, and use it to construct a vector field with a given curl. To attempt that, we need some more identities. And to obtain those identities, we must take the detour that we have made a virtue of not taking until now…

Coordinates!

Indicial notation; implicit summation

Considering that a scalar field is a function of three coordinates, while a vector field has three components each of which is a function of three coordinates, we can readily imagine that coordinate-based derivations of vector-analytic identities are likely to be excruciatingly repetitive, unless perhaps we choose a notation that concisely specifies the repetition. So, instead of writing the Cartesian coordinates as x, y, z, we shall usually write them as xi where i ∊ {1, 2, 3}; and instead of writing the unit vectors in the directions of the respective axes as i, j, k, we shall usually write them as ei. And for partial differentiation w.r.t. xi, instead of writing ∂/∂xi or even ∂xi, we shall write ∂i.

Now comes a stroke of genius for which we are indebted to Einstein, although he used it in a more sophisticated context! Instead of writing the position vector as

$$\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$$

or even as

$$\mathbf{r} = \sum_{i=1}^{3} x_i\,\mathbf{e}_i,$$

we shall write it simply as

$$\mathbf{r} = x_i\,\mathbf{e}_i,$$

where it is understood that we sum over the repeated index. More generally, we shall write the vector field q as

$$\mathbf{q} = q_i\,\mathbf{e}_i$$

with implicit summation, and the vector field v as

$$\mathbf{v} = v_i\,\mathbf{e}_i$$

with implicit summation, and so on. (By that nomenclature, the position vector in Cartesian coordinates should be, and often is, called x; but we called it r because we wanted to call its magnitude r, for radius.)

Implicit summation not only avoids writing the Σ symbol and specifying the index of summation, but also allows a summation over two repeated indices, say i and j, to be considered as summed first over i and then over j or vice versa, removing the need for an explicit regrouping of terms. Of course, if we hide messy details behind a notation, we need to make sure that it handles those details correctly. In particular, when we perform an operation on an implicit sum, we implicitly perform it term-by-term, and must therefore make sure that the operation is valid when interpreted that way.
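For readers who think in code, implicit summation is simply a loop over each repeated index. The sketch below (standard-library Python, with zero-based indices standing in for i ∊ {1, 2, 3}) assembles a vector as x_i e_i and evaluates a double sum a_i b_j (e_i ⸱ e_j) in both loop orders. The function names and sample numbers are illustrative, not part of the notation.

```python
# Cartesian unit vectors e_1, e_2, e_3 (stored at indices 0, 1, 2)
E = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def from_components(x):
    """r = x_i e_i: the repeated index i is summed over."""
    return tuple(sum(x[i] * E[i][k] for i in range(3)) for k in range(3))

def double_sum_i_outer(a, b):
    """a_i b_j (e_i . e_j), with i in the outer loop and j in the inner loop."""
    return sum(sum(a[i] * b[j] * dot(E[i], E[j]) for j in range(3))
               for i in range(3))

def double_sum_j_outer(a, b):
    """The same double sum with the loop order interchanged."""
    return sum(sum(a[i] * b[j] * dot(E[i], E[j]) for i in range(3))
               for j in range(3))

r = from_components((2.0, -1.0, 0.5))
a, b = (1.0, 2.0, 3.0), (4.0, -5.0, 6.0)
s_ij = double_sum_i_outer(a, b)
s_ji = double_sum_j_outer(a, b)
```

Both loop orders give the same result, and only the terms with j = i contribute, so the double sum collapses to a_i b_i, the ordinary dot product.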

Operators in Cartesian coordinates

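Gradient:  Putting s = xi in (9g), we find that the scalar component of ∇p in the direction of each ei is ∂i p. To obtain the vector component in that direction, we multiply by ei. Assembling the components, we have (with implicit summation)

$$\nabla p = \mathbf{e}_i\,\partial_i p \tag{58g}$$

or, in operational terms,

$$\nabla = \mathbf{e}_i\,\partial_i \tag{58o}$$

or, in traditional longhand notation,

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}. \tag{58z}$$

As reported by Chen-To Tai (1994), there are unfortunately some textbooks in which the del operator is defined as

$$\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \qquad [\textit{sic}!]$$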

which, on its face, is not an operator at all, but a self-contained expression whose value is the zero vector (because it is a sum of derivatives of constant vectors). Among the offenders is Erwin Kreyszig, who, in the 6th edition of his bestselling Advanced Engineering Mathematics (1988, p. 486), misdefines the del operator thus and then rewrites the gradient of f as ∇f, apparently imagining that the differentiation operators look through the constant vectors rather than at them. Six pages later, he defines the divergence in Cartesian coordinates (which we shall do shortly) and then immediately informs us that "Another common notation for the divergence of v is ∇⸱v," where ∇ is defined as before, but the resulting ∇⸱v is apparently not identically zero![18] These errors persist in the 10th edition (2011, pp. 396, 402–3). Tai finds similar howlers in mathematics texts by Wilfred Kaplan, Ladis D. Kovach, and Merle C. Potter, and in electromagnetics texts by William H. Hayt and Martin A. Plonus.[19] Knudsen & Katz, in Fluid Dynamics and Heat Transfer (1958), avoid the misdefinition of ∇, but implicitly define the divergence of V as V⸱∇ (which, as we have seen, is actually an operator), and then somehow reduce it to the correct expression for div V. [20] But I digress.

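Curl and divergence:  Expressing the operand of the curl in components, and noting that the unit vectors are uniform, we can apply (8p):

$$\operatorname{curl}\mathbf{q} = \operatorname{curl}\,(q_j\,\mathbf{e}_j) = \nabla q_j\times\mathbf{e}_j = \mathbf{e}_i\,\partial_i q_j\times\mathbf{e}_j.$$

If we sum over j first, this is

$$\operatorname{curl}\mathbf{q} = \mathbf{e}_i\times\partial_i\mathbf{q} \tag{59c}$$

or, in traditional longhand,

$$\operatorname{curl}\mathbf{q} = \mathbf{i}\times\frac{\partial\mathbf{q}}{\partial x} + \mathbf{j}\times\frac{\partial\mathbf{q}}{\partial y} + \mathbf{k}\times\frac{\partial\mathbf{q}}{\partial z}.$$

For the divergence we proceed as for the curl except that, instead of (8p), we use (8g):

$$\operatorname{div}\mathbf{q} = \operatorname{div}\,(q_j\,\mathbf{e}_j) = \nabla q_j\cdot\mathbf{e}_j = \mathbf{e}_i\,\partial_i q_j\cdot\mathbf{e}_j;$$

that is,

$$\operatorname{div}\mathbf{q} = \mathbf{e}_i\cdot\partial_i\mathbf{q} \tag{60d}$$

or, in traditional longhand,

$$\operatorname{div}\mathbf{q} = \mathbf{i}\cdot\frac{\partial\mathbf{q}}{\partial x} + \mathbf{j}\cdot\frac{\partial\mathbf{q}}{\partial y} + \mathbf{k}\cdot\frac{\partial\mathbf{q}}{\partial z}.$$

Although the above expressions for the divergence and curl will surprise many modern readers, they match the initial definitions of the divergence and curl given by the founder of vector analysis, J. Willard Gibbs (1881, § 54).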

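Gibbs even uses the ∇⸱ and ∇× notations on the left sides of the defining equations, and only after the equations (albeit immediately after) does he announce that "∇⸱ω is called the divergence of ω and ∇×ω its curl." (He uses Greek letters for vectors.) He does not offer any justification for the ∇⸱ and ∇× notations, but nor is a justification hard to find. Because ei is a uniform vector, we can rewrite (59c) rigorously as

$$\operatorname{curl}\mathbf{q} = \partial_i\,(\mathbf{e}_i\times\mathbf{q}) \tag{61}$$

and thence operationally as

$$\operatorname{curl} = \mathbf{e}_i\,\partial_i\,\times \tag{61c}$$

or, recalling (58o),

$$\operatorname{curl} = \nabla\times,$$

which can be evaluated in the usual manner as

$$\operatorname{curl}\mathbf{q} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] q_x & q_y & q_z \end{vmatrix},$$

where qx is the x component of q, etc. This indeed is how one evaluates the curl of a given field in Cartesian coordinates, although we shall find (59c) more convenient for deriving identities. Similarly, we can rewrite (60d) rigorously as

$$\operatorname{div}\mathbf{q} = \partial_i\,(\mathbf{e}_i\cdot\mathbf{q}) \tag{62}$$

and thence operationally as

$$\operatorname{div} = \mathbf{e}_i\,\partial_i\,\cdot \tag{62d}$$

or, recalling (58o),

$$\operatorname{div} = \nabla\,\cdot\,.$$

For evaluating the divergence of a given field, however, we simplify (62) to

$$\operatorname{div}\mathbf{q} = \partial_i\,q_i$$

or, in traditional longhand,

$$\operatorname{div}\mathbf{q} = \frac{\partial q_x}{\partial x} + \frac{\partial q_y}{\partial y} + \frac{\partial q_z}{\partial z},$$

although we shall find (60d) more convenient for deriving identities.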
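As a hedged numerical illustration, the Gibbs forms of the operators (curl q taken as the sum of ei × ∂i q, and div q as the sum of ei ⸱ ∂i q) can be compared with the familiar component formulas. The sketch below uses standard-library Python and central differences; the polynomial test field and the sample point are hypothetical choices made only for the check, and the component formulas for that field were worked out by hand.

```python
# Cartesian unit vectors e_1, e_2, e_3 (stored at indices 0, 1, 2)
E = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def q(x, y, z):
    """Hypothetical test field, chosen only for illustration."""
    return (x*y*z, x*z*z, x*x*y + z*z)

def partial(field, i, p, h=1e-5):
    """Central-difference estimate of the vector d_i(field) at the point p."""
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    return tuple((a - b) / (2*h) for a, b in zip(field(*plus), field(*minus)))

def curl_gibbs(field, p):
    """curl q = e_i x d_i q, summed over i."""
    terms = [cross(E[i], partial(field, i, p)) for i in range(3)]
    return tuple(sum(t[k] for t in terms) for k in range(3))

def div_gibbs(field, p):
    """div q = e_i . d_i q, summed over i."""
    return sum(dot(E[i], partial(field, i, p)) for i in range(3))

def curl_components(p):
    """Determinant form of curl q, with the derivatives of q taken by hand."""
    x, y, z = p
    return (x*x - 2*x*z, x*y - 2*x*y, z*z - x*z)

def div_components(p):
    """Sum of d_i q_i for the same field, taken by hand."""
    x, y, z = p
    return y*z + 2*z

p = (0.7, -0.4, 1.3)
curl_error = max(abs(a - b) for a, b in zip(curl_gibbs(q, p), curl_components(p)))
div_error = abs(div_gibbs(q, p) - div_components(p))
```

The agreement, to within the finite-difference error, reflects the fact that summing ei × ∂i q over i reproduces the expansion of the determinant term by term.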

Notice that we can get from (62d) back to (60d) by permuting the ∂i with the dot, and from (61c) back to (59c) by permuting the ∂i with the cross, as if the differentiation operator could, as it were, look through the dot or the cross; or, as Gibbs's student Edwin B. Wilson puts it, "pass by" the dot and the cross, yielding Gibbs's original definitions.[21] Hence Wilson considers it helpful to regard Gibbs's ∇⸱ and ∇× notations as "the (formal) scalar product and the (formal) vector product" or "the symbolic scalar and vector products", and to regard ∇ as a "symbolic vector".[22]

Tai (1994, 1995) rejects Wilson's argument together with the entire tradition of treating ∇⸱ and ∇× as compound operators. Of formal products, Tai says that the concept "has had a tremendously detrimental effect upon the learning of vector analysis", and calls such a product a "meaningless assembly".[23] Of the "pass by" step, he complains that "standard books on mathematical analysis do not have such a theorem."[24]

I submit, however, that the intermediate steps (61) and (62), after which we take the constant multiplier outside the operator (eqs. 61c & 62d), support Wilson's "pass by" argument. I further submit that the great generality of our derivations of equations (14), above, compels us to treat the ∇⸱ and ∇× notations as more than mere notations. That being said, I shall find some points of agreement with Tai, and some reasons to criticize Wilson.

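Laplacian:  If ψ is a scalar field, then

$$\nabla^2\psi = \operatorname{div}\nabla\psi = \mathbf{e}_i\cdot\partial_i\,(\mathbf{e}_j\,\partial_j\psi) = \mathbf{e}_i\cdot\mathbf{e}_j\;\partial_i\partial_j\psi;$$

that is,

$$\nabla^2\psi = \partial_i\partial_i\,\psi \tag{63L}$$

with implicit summation. In traditional longhand, this is

$$\nabla^2\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}$$

or, in operational terms,

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$

or, by comparison with (58z),

$$\nabla^2 = \nabla\cdot\nabla,$$

as expected.

By the linearity of the Laplacian, the same applies if ψ is any field expressible in terms of a uniform basis. In particular, if ψ is a vector field given by ψjej (with implicit summation), then

$$\nabla^2\boldsymbol{\psi} = \partial_i\partial_i\,(\psi_j\,\mathbf{e}_j) = \mathbf{e}_j\,\partial_i\partial_i\,\psi_j = \mathbf{e}_j\,\nabla^2\psi_j.$$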

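Advection, etc.:  If ψ is a scalar field, then

$$\mathbf{v}\cdot\nabla\psi = v_i\,\mathbf{e}_i\cdot\mathbf{e}_j\,\partial_j\psi.$$

In this double summation, the only non-zero terms are those for which j = i, in which case ei⸱ej = 1. So we have

$$\mathbf{v}\cdot\nabla\psi = v_i\,\partial_i\psi \tag{64}$$

or, in operational terms,

$$\mathbf{v}\cdot\nabla = v_i\,\partial_i$$

or, in traditional longhand,

$$\mathbf{v}\cdot\nabla = v_x\,\frac{\partial}{\partial x} + v_y\,\frac{\partial}{\partial y} + v_z\,\frac{\partial}{\partial z}.$$

And by the linearity of the directional derivative in (11), the same applies if ψ is a vector field or any field expressible in terms of a uniform basis.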

Identities without pain

In deriving the Cartesian expressions for the gradient, curl, divergence, Laplacian, and advection operators, we used identities (9g), (8p), (8g), (9L'), and (11). It will now be a worthwhile exercise to use those newly-derived Cartesian expressions to confirm other previously-mentioned identities.

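If b is uniform,

$$\operatorname{div}\,(\mathbf{q}\times\mathbf{b}) = \mathbf{e}_i\cdot\partial_i\,(\mathbf{q}\times\mathbf{b}) = \mathbf{e}_i\cdot(\partial_i\mathbf{q})\times\mathbf{b} = (\mathbf{e}_i\times\partial_i\mathbf{q})\cdot\mathbf{b} = \operatorname{curl}\mathbf{q}\cdot\mathbf{b},$$

confirming (8c); but the confirmation is not independent, because (8c) was used to establish the unambiguity of the curl. More significant would be a verification of (8q): if b is uniform,

$$\begin{aligned}
\mathbf{b}\times\operatorname{curl}\mathbf{q} &= \mathbf{b}\times(\mathbf{e}_i\times\partial_i\mathbf{q}) \\
&= \mathbf{e}_i\,(\mathbf{b}\cdot\partial_i\mathbf{q}) - (\mathbf{b}\cdot\mathbf{e}_i)\,\partial_i\mathbf{q} \\
&= \mathbf{e}_i\,(\mathbf{b}\cdot\partial_i\mathbf{q}) - b_j\,\mathbf{e}_j\cdot\mathbf{e}_i\,\partial_i\mathbf{q} \\
&= \mathbf{e}_i\,\partial_i\,(\mathbf{b}\cdot\mathbf{q}) - b_i\,\partial_i\mathbf{q} \\
&= \nabla(\mathbf{b}\cdot\mathbf{q}) - \mathbf{b}\cdot\nabla\mathbf{q},
\end{aligned}$$

so that we can indeed ignore the question mark on (8q). Again, in the double summation on the third line above, the only non-zero terms are those for which j = i, in which case ej⸱ei = 1. Only the fourth line exploits the uniformity of b. If we write curl q as ∇ × q, the identity (8q) looks like the expansion of a vector triple product; and the above proof, based on the Gibbs definition of the curl (59c), actually uses such an expansion. But the identity is valid only for uniform b.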

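Regardless of the physical meanings of the scalar field ρ and the vector field v,

$$\operatorname{div}\,(\rho\,\mathbf{v}) = \mathbf{e}_i\cdot\partial_i\,(\rho\,\mathbf{v}) = \mathbf{e}_i\cdot(\rho\,\partial_i\mathbf{v} + \mathbf{v}\,\partial_i\rho) = \rho\;\mathbf{e}_i\cdot\partial_i\mathbf{v} + \mathbf{v}\cdot\mathbf{e}_i\,\partial_i\rho = \rho\,\operatorname{div}\mathbf{v} + \mathbf{v}\cdot\nabla\rho,$$

which shows, as promised, that (17) is an identity. Thus the product rule for div follows from the product rule for ∂i.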

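The expression commonly written as ∇ × ∇p is

$$\operatorname{curl}\nabla p = \mathbf{e}_i\times\partial_i\,(\mathbf{e}_j\,\partial_j p) = \mathbf{e}_i\times\mathbf{e}_j\;\partial_i\partial_j p.$$

This is a sum of nine terms. The three terms with i = j are zero because the cross-products are zero. For each of the three terms with i < j, there is an equal-and-opposite term with i > j, because interchanging i and j changes the sign of the cross-product while leaving the mixed partial derivative unchanged. So the non-zero terms cancel in pairs and the sum is zero, in agreement with identity (24c).

And the expression commonly written as ∇⸱∇ × q is

$$\operatorname{div}\operatorname{curl}\mathbf{q} = \mathbf{e}_i\cdot\partial_i\,(\mathbf{e}_j\times\partial_j\mathbf{q}) = \mathbf{e}_i\cdot\mathbf{e}_j\times\partial_i\partial_j\mathbf{q},$$

which likewise comes to zero, in agreement with identity (24d).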

The above "confirming" exercise was an opportunity to gain familiarity with the Cartesian forms of the operators, verify two identities which were previously only tentative (8q & 17), and take stock of our inventory of identities. The stocktake may have drawn attention to the following shortcomings:

  • we have not yet investigated "grad div" and "curl curl";
  • we have only one product rule in which both factors are spatially variable fields, namely (17); identities (8c) and (8p) need to be generalized accordingly;
  • our collection of product rules does not yet include the curl of a cross-product or the gradient of a dot-product; and
  • we do not yet have any chain rules involving grad, curl, or div.

With the aid of the Cartesian forms of the various operators, we may now fill these gaps.

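The "grad div" and "curl curl" operators turn out to be related:

$$\operatorname{curl}\operatorname{curl}\mathbf{q} = \mathbf{e}_i\times\partial_i\,(\mathbf{e}_j\times\partial_j\mathbf{q}) = \mathbf{e}_i\times(\mathbf{e}_j\times\partial_i\partial_j\mathbf{q}) = \mathbf{e}_j\,(\mathbf{e}_i\cdot\partial_i\partial_j\mathbf{q}) - (\mathbf{e}_i\cdot\mathbf{e}_j)\,\partial_i\partial_j\mathbf{q}.$$

In the first term on the right, we can switch the order of partial differentiation; and in the second term (which, like the first, is a double summation) the only non-zero contributions are those for which j = i and ei⸱ej = 1. So we have

$$\operatorname{curl}\operatorname{curl}\mathbf{q} = \mathbf{e}_j\,\partial_j\,(\mathbf{e}_i\cdot\partial_i\mathbf{q}) - \partial_i\partial_i\,\mathbf{q};$$

that is,

$$\operatorname{curl}\operatorname{curl}\mathbf{q} = \nabla\operatorname{div}\mathbf{q} - \nabla^2\mathbf{q}. \tag{65}$$

This result may be memorized as "curl curl equals grad div minus del squared" and written as ∇ × (∇ × q) = ∇ ∇⸱q − ∇²q, which looks like the expansion of a vector triple product; and the above derivation, based on the Gibbs definitions of the operators, uses such an expansion.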
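A numerical spot-check of "curl curl equals grad div minus del squared" is sketched below in standard-library Python. All derivatives are nested central differences in Cartesian coordinates; the cubic test field, the sample point, and the step size are assumptions made only for illustration. For a cubic polynomial the nested differences are exact apart from rounding, which keeps the comparison clean.

```python
def partial(fn, i, p, h=1e-3):
    """Central difference of fn (scalar- or tuple-valued) w.r.t. coordinate i at p."""
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    a, b = fn(tuple(plus)), fn(tuple(minus))
    if isinstance(a, tuple):
        return tuple((u - v) / (2*h) for u, v in zip(a, b))
    return (a - b) / (2*h)

def grad(s):
    return lambda p: tuple(partial(s, i, p) for i in range(3))

def div(q):
    return lambda p: sum(partial(q, i, p)[i] for i in range(3))

def curl(q):
    def curl_q(p):
        d = [partial(q, i, p) for i in range(3)]   # d[i][j] approximates d_i q_j
        return (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])
    return curl_q

def laplacian(q):
    def lap_q(p):
        total = (0.0, 0.0, 0.0)
        for i in range(3):
            first = lambda s, i=i: partial(q, i, s)   # d_i q as a function of position
            second = partial(first, i, p)             # d_i d_i q at p
            total = tuple(t + u for t, u in zip(total, second))
        return total
    return lap_q

def q(p):
    """Hypothetical cubic test field, chosen only for illustration."""
    x, y, z = p
    return (x*y*z, x*z*z, x*x*y + z**3)

p = (0.7, -0.4, 1.3)
lhs = curl(curl(q))(p)
gd, lap = grad(div(q))(p), laplacian(q)(p)
rhs = tuple(g - l for g, l in zip(gd, lap))
error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

For this particular field, working by hand gives curl curl q = (0, z − 2x, −y), so both sides should be close to (0, −0.1, 0.4) at the chosen point.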

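Identity (65) may be rearranged as

$$\nabla^2\mathbf{q} = \nabla\operatorname{div}\mathbf{q} - \operatorname{curl}\operatorname{curl}\mathbf{q} \tag{66}$$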

("del squared equals grad div minus curl curl"), which we could use as a coordinate-free definition of the Laplacian of a vector, if we did not already have one.[25] But we do: we started with a coordinate-free definition (4L) for a generic field, established its unambiguity via (9L), and found its Cartesian form (63L), which we used in the derivation of (66). Wherever we start, we may properly assert by way of contrast that the Laplacian of a vector  is given by (66), whereas the Laplacian of a scalar  is given by the divergence of the gradient. But we should not conclude, as Moon & Spencer do, that representing the scalar and vector Laplacians by the same symbol is "poor practice… since the two are basically quite different",[26] because in fact the two have a common definition which is succinct, unambiguous, and coordinate-free: the Laplacian (of anything) is the closed-surface integral of the outward normal derivative, per unit volume.[u]

If q is solenoidal, the first term on the right of (65) vanishes. Hence for a solenoidal field, the curl of the curl is minus the Laplacian. For example, in the dynamic case, in a vacuum, the Maxwell–Ampère law says that  curl H = ∂D/∂t = ε₀ ∂E/∂t .  Multiplying this by the physical constant μ₀ (called the vacuum permeability or simply the magnetic constant) gives  curl B = μ₀ε₀ ∂E/∂t ,  whence

    curl curl B = μ₀ε₀ ∂/∂t (curl E) .

But, by Gauss's law for magnetism, B is solenoidal, so that [by (65)] the left-hand side of the above is  −△B .  And by Faraday's law,  curl E = −∂B/∂t ,  so that  ∂/∂t (curl E) = −∂²B/∂t² .  Making these substitutions gives  −△B = −μ₀ε₀ ∂²B/∂t² ,  i.e.

    △B = μ₀ε₀ ∂²B/∂t² .

By comparison with (45), this is the wave equation with

    c = 1/√(μ₀ε₀) .

Thus the Maxwell–Ampère law, Gauss's law for magnetism, and Faraday's law, with the aid of (65), predict the existence of electromagnetic waves together with their speed. It would therefore be difficult to overstate the importance of identity (65).
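To put a number on the predicted speed, the sketch below assumes that the wave equation obtained above has the standard vacuum form, so that c = 1/√(μ₀ε₀), where ε₀ is the vacuum permittivity. The numerical constants are the CODATA 2018 recommended values, quoted here as assumptions; the function name `wave_speed` is purely illustrative.

```python
import math

# CODATA 2018 recommended values in SI units (assumed inputs, not from the text)
MU_0 = 1.25663706212e-6       # vacuum permeability, N/A^2
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m

def wave_speed(mu, eps):
    """Speed c for which  laplacian(B) = mu*eps * d2B/dt2  is the wave equation."""
    return 1.0 / math.sqrt(mu * eps)

c = wave_speed(MU_0, EPSILON_0)
print(f"predicted wave speed: {c:.6e} m/s")
```

The result should agree closely with the defined speed of light, 299 792 458 m/s, which is the numerical coincidence that identified light as an electromagnetic wave.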

We now turn our attention to product rules in which neither factor is assumed uniform. The curl of a cross-product is

i.e.,

 

 

 

 

(67c)

The divergence is simpler:

i.e.,

 

 

 

 

(67d)

In particular, in electromagnetics,  div(E × H) ≡ H ⸱ curl E − E ⸱ curl H ;  this is the identity on which Poynting's theorem is based.
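The electromagnetic instance of the divergence rule, div(E × H) = H ⸱ curl E − E ⸱ curl H, can be spot-checked numerically. The sketch below uses central differences on the Cartesian forms; the two fields, the step size, and the sample point are hypothetical test data with no physical significance.

```python
import math

STEP = 1e-4  # finite-difference step; an illustrative choice

def E(x, y, z):
    """Hypothetical smooth test field standing in for E."""
    return (math.sin(y * z), x * z, math.cos(x) + y)

def Hf(x, y, z):
    """Hypothetical smooth test field standing in for H."""
    return (y * y, math.exp(z) * x, x * y * z)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def partial(f, p, k):
    """Central-difference partial derivative of a vector field f along axis k at p."""
    hi, lo = list(p), list(p)
    hi[k] += STEP
    lo[k] -= STEP
    return tuple((u - v) / (2 * STEP) for u, v in zip(f(*hi), f(*lo)))

def div(f, p):
    return sum(partial(f, p, k)[k] for k in range(3))

def curl(f, p):
    dx, dy, dz = (partial(f, p, k) for k in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

def e_cross_h(x, y, z):
    return cross(E(x, y, z), Hf(x, y, z))

point = (0.4, 0.9, -0.6)
lhs = div(e_cross_h, point)
rhs = dot(Hf(*point), curl(E, point)) - dot(E(*point), curl(Hf, point))
err = abs(lhs - rhs)

print("div(E x H)              =", lhs)
print("H.curl E - E.curl H     =", rhs)
print("discrepancy             =", err)
```

Here the discrepancy reflects the truncation error of the difference quotients, so it should shrink roughly in proportion to the square of `STEP` until rounding error takes over.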

[To be continued.]

Additional information

Competing interests

None.

Ethics statement

This article does not concern research on human or animal subjects.

TO DO:

  • Keywords
  • Figure(s) & caption(s)
  • Etc.!

Notes

  1. E.g., Feynman (1963, vol. 1, §11-5), having defined velocity from displacement in Cartesian coordinates, shows that velocity is a vector by showing that its coordinate representation contra-rotates (like that of displacement) if the coordinate system rotates.
  2. E.g., Feynman (1963, vol. 1, §11-7), having defined the magnitude and dot-product operators in Cartesian coordinates, shows that they are scalar operators by showing that their representations in rotated coordinates are the same as in the original coordinates (except for names of coordinates and components). And Tai (1995, pp. 40–42), having determined the form of the "gradient" operator in a general curvilinear orthogonal coordinate system, shows that it is a vector operator by showing that it has the same form in any other curvilinear orthogonal coordinate system.
  3. Even if we claim that "particles" of matter are wave functions and therefore continuous, this still implies that matter is lumpy in a manner not normally contemplated by continuum mechanics.
  4. If r is the position of a particle and p is its momentum, the last term vanishes. If the force is toward the origin, the previous term also vanishes, and we are left with conservation of angular momentum about the origin.
  5. Here we use the broad triangle symbol (△) rather than the narrower Greek Delta (Δ); the latter would more likely be misinterpreted as "change in…"
  6. There is no need for parentheses around ρv , because div ρv cannot mean (div ρ)v , because the divergence of a scalar field is not defined.
  7. The material derivative d/dt is also called the substantive derivative, and is sometimes written D/Dt if the result is meant to be understood as a field rather than simply a function of time (Kemmer, 1977, pp. 184–5).
  8. Or nabla, because it allegedly looks like the ancient Phoenician harp that the Greeks called by that name.
  9. But Gibbs (1881) and Wilson (1907) were content to leave it as ∇⸱∇.  And they did not call it the Laplacian; they used that term with a different meaning, which has apparently fallen out of fashion.
  10. The common perception that they are valid only in Cartesian coordinates arises chiefly from failure to allow for the variability of the basis vectors in other coordinate systems; cf. Kemmer, 1977, pp. 163–5, 172–3 (Exs. 2, 3, 5), 230–33 (sol'ns), and Feynman, 1963, vol. 2, §2-8 ("Pitfall number two…").
  11. Stress is a second-order tensor, and the origin of the term "tensor"; but, for present purposes, it's just another possible example of a field called ψ.
  12. In mathematical jargon, it should be a two-dimensional manifold embedded in 3D Euclidean space.
  13. If any part of our argument requires Σ or C to be smooth, this is not an impediment, because having approximated Σ or C to any desired accuracy by a polyhedron or polygon, we can then approximate the polyhedron or polygon to any desired higher accuracy by a smooth surface or curve!
  14. Although Hsu (1984, p. 141) applies that name to our theorem (5c).
  15. In the general case, there is an extra term ∂D/∂t on the right; but this term is zero in the magnetostatic case.
  16. Although Hsu (1984, p. 141) applies that name to our theorem (5g).
  17. When a gas is compressed, work is done on it, causing its temperature to rise, so that the ratio of dp to dρ is higher than if the compression were isothermal. In sound waves, there is typically not enough time for a significant part of the heat of compression to be conducted away; that is, the compression is near enough to adiabatic. The words "not enough time" may suggest that the adiabatic approximation is a high-frequency approximation. But in fact, in free air, it is a low-frequency approximation, because as the frequency is reduced, the equalization of temperature is hindered more by the longer wavelength than it is helped by the longer period. Only in a confined space, which limits the required distance of conduction, does the adiabatic assumption require the frequency to be above some lower limit. In a musical wind instrument, that lower limit tends to be far below the audible range. Meanwhile the upper limit, due to easier heat conduction within a shorter wavelength, tends to be very far above the audible range. Thus, under typical conditions, for the purpose of calculating c , the adiabatic assumption is reasonable. (See Fletcher, 1974.)
  18. Or sometimes "quabla", by analogy with "nabla".
  19. In particular, some authorities change the sign, defining as  and some write the operator (however defined) as □².
  20. The symbol c comes from a general-purpose Latin word for speed, but has become the usual symbol for wave speed.
  21. Tai (1995, pp. 43–4) also disagrees with Moon & Spencer, but for a different reason: he regards the Laplacian as the divergence of the gradient even if the operand is a vector field. For better or worse, we do not consider the gradient of a vector in the present paper.

References

  1. Axler, 1995, §9. The relegation of determinants was anticipated by C.G. Broyden (1975). But Broyden's approach is less radical: he does not deal with abstract vector spaces or abstract linear transformations, and his eventual definition of the determinant, unlike Axler's, is traditional—not a product of the preceding narrative.
  2. Axler, 1995, §1. But it is Broyden (1975), not Axler, who discusses numerical methods at length.
  3. There are many proofs and interpretations of this identity. My own effort, for what it's worth, is "Trigonometric proof of vector triple product expansion", Mathematics Stack Exchange, t.co/NM2v4DJJGo, 2024. The classic is Gibbs, 1881, §§ 26–7.
  4. Gibbs, 1881, § 56.
  5. Katz, 1979, pp. 146–9.
  6. In the Feynman Lectures on Physics (1963),  −∇p as the "pressure force per unit volume" eventually appears in the 3rd-last lecture of Volume 2 (§40-1).
  7. A demonstration like the foregoing is outlined by Gibbs (1881, § 55).
  8. Wilson, 1907, pp. 147–8; Borisenko & Tarapov, 1968, pp. 147–8 (again); Hsu, 1984, p. 92; Kreyszig, 1988, pp. 485–6; Wrede & Spiegel, 2010, p. 198.
  9. Gibbs (1881, § 50) introduces the gradient with this definition, except that he calls ∇u simply the derivative of u, and u the primitive of ∇u. Use of the term gradient as an alternative to derivative is reported by Wilson (1907, p. 138).
  10. Cf. Borisenko & Tarapov, 1968, p. 157, eq. (4.43), quoted in Tai, 1995, p. 33, eq. (4.19).
  11. Kemmer (1977, p. 98, eq. 4) gives an equivalent result for our first three integral theorems (5g to 5d) only, and calls it the generalized divergence theorem because the divergence theorem is its most familiar special case.
  12. E.g., Gibbs, 1884, § 165, eq. (1); Wilson, 1907, p. 255, Ex. 1; Kemmer, 1977, p. 99, eq. (6); Hsu, 1984, p. 146, eq. (7.31).
  13. Cf. Katz, 1979, pp. 149–50.
  14. E.g., Gibbs 1881, § 61; Hsu, 1984, pp. 117–18.
  15. Cf. Feynman, 1963, vol. 2, §2-8.
  16. Presumably this is why Gibbs called the gradient simply the derivative (Gibbs, 1881, § 50; cf. §§ 51, 59).
  17. Our definition of strength follows the old convention used by Baker & Copson (1939, p. 42), Born & Wolf (2002, p. 421), and Larmor (1904, p. 5). The newer convention followed by Miller (1991, p. 1371) would use the denominator 4πr instead of our r in (48); this would have the advantage of eliminating the factor 4π from the D'Alembertian of the wave function, and the disadvantage of introducing that factor into (the denominator of) the wave function itself.
  18. The latter passage, as it appears in the 5th edition (p. 397), is the one cited by Tai (1994, p. 6).
  19. Quoted by Tai (1994), in alphabetical order within each category. For Kovach he could have added p. 308.  Potter he misnames as Porter.
  20. Quoted by Tai (1994, p. 23).
  21. Wilson, 1907, p. 150.
  22. Wilson, 1907, pp. 150, 152.
  23. Tai, 1995, pp. 26, 38.
  24. Tai, 1995, p. 28.
  25. Cf. Gibbs, 1881, § 71, and Moon & Spencer, 1965, p. 235; quoted in Tai, 1995, pp. 18, 43.
  26. Moon & Spencer, 1965, p. 236.

Bibliography

  • S.J. Axler, 1995, "Down with Determinants!"  American Mathematical Monthly, vol. 102, no. 2 (Feb. 1995), pp. 139–54; jstor.org/stable/2975348.  (Author's preprint, with different pagination: researchgate.net/publication/265273063_Down_with_Determinants.)
  • S.J. Axler, 2023–, Linear Algebra Done Right, 4th Ed., Springer; linear.axler.net (open access).
  • B.B. Baker and E.T. Copson, 1939, The Mathematical Theory of  Huygens' Principle, Oxford.
  • A.I. Borisenko and I.E. Tarapov (tr. & ed. R.A. Silverman), 1968, Vector and Tensor Analysis with Applications, Prentice-Hall; reprinted New York: Dover, 1979, archive.org/details/vectortensoranal0000bori.
  • M. Born and E. Wolf, 2002, Principles of Optics, 7th Ed., Cambridge, 1999 (reprinted with corrections, 2002).
  • C.G. Broyden, 1975, Basic Matrices, London: Macmillan.
  • R.P. Feynman, R.B. Leighton, & M. Sands, 1963 etc., The Feynman Lectures on Physics, California Institute of Technology; feynmanlectures.caltech.edu.
  • N.H. Fletcher, 1974, "Adiabatic assumption for wave propagation", American Journal of Physics, vol. 42, no. 6 (June 1974), pp. 487–9; doi.org/10.1119/1.1987757.
  • J.W. Gibbs, 1881–84, "Elements of Vector Analysis", privately printed New Haven: Tuttle, Morehouse & Taylor, 1881 (§§ 1–101), 1884 (§§ 102–189, etc.), archive.org/details/elementsvectora00gibb; published in The Scientific Papers of J. Willard Gibbs (ed. H.A. Bumstead & R.G. Van Name), New York: Longmans, Green, & Co., 1906, vol. 2, archive.org/details/scientificpapers02gibbuoft, pp. 17–90.
  • H.P. Hsu, 1984, Applied Vector Analysis, Harcourt Brace Jovanovich; archive.org/details/appliedvectorana00hsuh.
  • V.J. Katz, 1979, "The history of Stokes' theorem", Mathematics Magazine, vol. 52, no. 3 (May 1979), pp. 146–56; jstor.org/stable/2690275.
  • N. Kemmer, 1977, Vector Analysis: A physicist's guide to the mathematics of fields in three dimensions, Cambridge; archive.org/details/isbn_0521211581.
  • E. Kreyszig, 1962 etc., Advanced Engineering Mathematics, New York: Wiley;  5th Ed., 1983;  6th Ed., 1988;  9th Ed., 2006;  10th Ed., 2011.
  • J. Larmor, 1904, "On the mathematical expression of the principle of  Huygens" (read 8 Jan. 1903), Proceedings of the London Mathematical Society, Ser. 2, vol. 1 (1904), pp. 1–13.
  • D.A.B. Miller, 1991, "Huygens's wave propagation principle corrected", Optics Letters, vol. 16, no. 18 (15 Sep. 1991), pp. 1370–72; stanford.edu/~dabm/146.pdf.
  • P.H. Moon and D.E. Spencer, 1965, Vectors, Princeton, NJ: Van Nostrand.
  • W.K.H. Panofsky and M. Phillips, 1962, Classical Electricity and Magnetism, 2nd Ed., Addison-Wesley; reprinted Mineola, NY: Dover, 2005.
  • C.-T. Tai, 1990, "Differential operators in vector analysis and the Laplacian of a vector in the curvilinear orthogonal system" (Technical Report RL 859), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/21026.
  • C.-T. Tai, 1994, "A survey of the improper use of ∇ in vector analysis" (Technical Report RL 909), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7869.
  • C.-T. Tai, 1995, "A historical study of vector analysis" (Technical Report RL 915), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7868.
  • E.B. Wilson, 1907, Vector Analysis: A text-book for the use of students of mathematics and physics ("Founded upon the lectures of J. Willard Gibbs…"), 2nd Ed., New York: Charles Scribner's Sons; archive.org/details/vectoranalysisa01wilsgoog.
  • R.C. Wrede and M.R. Spiegel, 2010, Advanced Calculus, 3rd Ed., New York: McGraw-Hill (Schaum's Outlines); archive.org/details/schaumsoutlinesa0000wred.