WikiJournal Preprints/Cut the coordinates! (or Vector Analysis Done Fast)


WikiJournal Preprints
Open access • Publication charge free • Public peer review

WikiJournal User Group is a publishing group of open-access, free-to-publish, Wikipedia-integrated academic journals.

Article information

Author: Gavin R. Putland[i] 


Abstract

(Partial draft—under construction.)


Introduction


Sheldon Axler, in his essay "Down with determinants!" (1995) and his ensuing book Linear Algebra Done Right (4th Ed., 2023–), does not entirely eliminate determinants, but introduces them as late as possible and then exploits them for what he calls their "main reasonable use in undergraduate mathematics", namely the change-of-variables formula for multiple integrals.[1] Here I treat coordinates in vector analysis somewhat as Axler treats determinants in linear algebra: I introduce coordinates as late as possible, and then exploit them in unconventionally rigorous derivations of vector-analytic identities from vector-algebraic identities. But my approach contrasts with Axler's in at least two ways. First, as my subtitle suggests, I have no intention of expanding my paper into a book. Brevity is of the essence. Second, while one may well avoid determinants in numerical linear algebra,[2] one can hardly avoid coordinates in numerical vector analysis! So I cannot extend the coordinate-light path into computation. But I can extend it up to the threshold by expressing the operators of vector analysis in a suitably general coordinate system, leaving others to specialize it and compute with it. On the way, I can satisfy readers who need the concepts of vector analysis for theoretical purposes, and who would rather read a paper than a book.

The cost of coordinates


Mathematicians define a "vector" as a member of a vector space, which is a set whose members satisfy certain basic rules of algebra (called the vector-space axioms) with respect to another set (called a field), which has its own basic rules of algebra (the field axioms), and whose members are called "scalars". Physicists are more fussy. They typically want a "vector" to be not only a member of a vector space, but also a first-order tensor: a "tensor", meaning that it has an existence independent of any coordinate system with which it might be specified; and "first-order" (or "first-degree", or "first-rank"), meaning that it is specified by a one-dimensional array of numbers. Similarly, a 2nd-order tensor is specified by a 2-dimensional array (a matrix), and a 3rd-order by a 3-dimensional array, and so on; and a "scalar", being specified by a single number (a zero-dimensional array), is a zero-order tensor. In "vector analysis", we are greatly interested in applications to physical situations, and accordingly take the physicists' view on what constitutes a vector or a scalar.

So, for our purposes, defining a quantity by three components in (say) a Cartesian coordinate system is not enough to make it a vector, and defining a quantity as a real function of a list of coordinates is not enough to make it a scalar, because we still need to show that the quantity has an independent existence. One way to do this is to show that its coordinate representation behaves appropriately when the coordinate system is changed. (But don't worry if the following details look cryptic, because we won't be using them!) Independent existence of a quantity means that its coordinate representation is contravariant—that is, the representation changes so as to compensate for the change in the coordinate system.[a] But independent existence of an operator means that its coordinate representation is covariant—that is, the representation of the operator in the coordinate system, with the operand(s) and the result in that system, has the same form in one coordinate system as in another (except for features internal to the system).[b]

I circumvent these complications by the most obvious route: where possible, I initially define the object without coordinates; and if a coordinate-based initial definition is thrust upon me, I promptly seek an equivalent coordinate-free definition. If, having defined something without coordinates, we then need to represent it with coordinates, we can choose the coordinate system for convenience.

The limitations of limits


In the branch of pure mathematics known as analysis, there is a thing called a limit, whereby for every positive ϵ  there exists a positive δ such that if some increment is less than δ, some error is less than ϵ. In the branch of applied mathematics known as continuum mechanics, there is a thing called reality, whereby if the increment is less than some positive δ, the assumption of a continuum becomes ridiculous, so that the error cannot be made less than an arbitrary ϵ. Yet vector "analysis" (together with higher-order tensors) is typically studied with the intention of applying it to some form of "continuum" mechanics, such as the modeling of elasticity, plasticity, fluid flow, or (widening the net) electrodynamics of ordinary matter; in short, it is studied with the intention of conveniently forgetting that, on a sufficiently small scale, matter is lumpy.[c] One might therefore submit that to express the principles of vector analysis in the language of limits is to strain at a gnat and swallow a camel. I avoid that camel by referring to elements of length or area or volume, each of which is small enough to allow some quantity or quantities to be considered uniform within it, but, for the same reason, large enough to allow such local averaging of the said quantity or quantities as is necessary to tune out the lumpiness.

We shall see bigger camels, where well-known authors define or misdefine a vector operator and then want to treat it like an ordinary vector (a quantity). These I also avoid.

Prerequisites


I assume that the reader is familiar with the algebra and geometry of vectors in 3D space, including the dot-product, the cross-product, and the scalar triple product, their geometric meanings, their expressions in Cartesian coordinates, and the identity

a × (b × c)  =  a⸱c b − a⸱b c ,

which we call the "expansion" of the vector triple product.[d] I further assume that the reader can generalize the concept of a derivative, so as to differentiate a vector with respect to a scalar, e.g.
dr/dt ,
or so as to differentiate a function of several independent variables "partially" w.r.t. one of them while the others are held constant, e.g.
∂f/∂x .
But in view of the above remarks on limits, I also expect the reader to be tolerant of an argument like this: In a short time dt, let the vectors r and p change by dr and dp, respectively. Then
d/dt (r × p)  =  dr/dt × p  +  r × dp/dt ,
where, as always, the orders of the cross-products matter.[e] Differentiation of a dot-product behaves similarly, except that the orders don't matter; and if  p = mv, where m is a scalar and v is a vector, then
dp/dt  =  dm/dt v  +  m dv/dt .
Or an argument like this:  If f is a function of x and y, then

∂/∂x (∂f/∂y)  =  ∂/∂y (∂f/∂x) ;
that is, we can switch the order of partial differentiation. If ∂x is an abbreviation for ∂/∂x, etc., this rule can be written in operational terms as

∂x ∂y  =  ∂y ∂x .

More generally, if ∂i is an abbreviation for ∂/∂xi where i = 1, 2,…, the rule becomes

∂i ∂j  =  ∂j ∂i .
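In coordinate terms this symmetry is easy to test numerically. The following Python sketch (the sample function and the point are arbitrary, chosen purely for illustration) approximates the two orders of differentiation by nested central differences:

```python
# Hedged numerical check of the symmetry of mixed partial derivatives,
# using nested central differences on an arbitrary smooth sample function.

def f(x, y):
    return x**3 * y**2          # illustrative choice, not from the text

h = 1e-4
x0, y0 = 1.3, 0.7               # arbitrary point

def dfdx(x, y):                 # ∂f/∂x by central difference
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def dfdy(x, y):                 # ∂f/∂y by central difference
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

dx_dy = (dfdy(x0 + h, y0) - dfdy(x0 - h, y0)) / (2 * h)   # ∂x ∂y f
dy_dx = (dfdx(x0, y0 + h) - dfdx(x0, y0 - h)) / (2 * h)   # ∂y ∂x f

print(abs(dx_dy - dy_dx))       # near zero
```

For this polynomial the two nested differences agree to within rounding and truncation error, consistent with ∂x ∂y = ∂y ∂x.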

These generalizations of differentiation, however, do not go beyond differentiation w.r.t. real variables, some of which are scalars, and some of which are coordinates. Vector analysis involves quantities that may be loosely described as derivatives w.r.t. a vector—usually the position vector.

Closed-surface integrals per unit volume


The term field, mentioned above in the context of algebraic axioms, has an alternative meaning: if r is the position vector, a scalar field is a scalar-valued function of r, and a vector field is a vector-valued function of r; both may also depend on time. These are the functions of which we want "derivatives" w.r.t. the vector r.

In this section I introduce four such derivatives—the gradient, the curl, the divergence, and the Laplacian—in a way that will seem unremarkable to those readers who aren't already familiar with them, but strange to those who are. The gradient is commonly introduced in connection with a curve and its endpoints, the curl in connection with a surface segment and its enclosing curve, the divergence in connection with a volume and its enclosing surface, and the Laplacian as a composite of two of the above, initially applicable only to a scalar field. Here I introduce all four in connection with a volume and its enclosing surface; and I introduce the Laplacian as a concept in its own right, equally applicable to a scalar or vector field, and only later relate it to the others. My initial definitions of the gradient, the curl, and the Laplacian, although not novel, are usually thought to be more advanced than the common ones—in spite of being conceptually simpler, and in spite of being obvious variations on the same theme.

Instant integral theorems—with a caveat


Let V be a volume (3D region) enclosed by a surface S (a mathematical surface, not generally a physical barrier). Let n̂ be the unit normal vector at a general point on S, pointing out of V. Let n be the distance from S in the direction of n̂ (positive outside V, negative inside), and let ∂n be an abbreviation for ∂/∂n, where the derivative—commonly called the normal derivative—is tacitly assumed to exist.

In V, and on S, let p be a scalar field (e.g., pressure in a fluid, or temperature), and let q be a vector field (e.g., flow velocity, or heat-flow density), and let ψ be a generic field which may be a scalar or a vector. Let a general element of the surface S have area dS, and let it be small enough to allow n̂, p, q, and ∂n ψ to be considered uniform over the element. Then, for every element, the following four products are well defined:

p n̂ dS ,      n̂ × q dS ,      q ⸱ n̂ dS ,      ∂n ψ dS .         (1)

If p is pressure in a non-viscous fluid, the first of these products is the force exerted by the fluid in V  through the area dS. The second product does not have such an obvious physical interpretation; but if q is circulating clockwise about an axis directed through V, the cross-product will be exactly tangential to S and will tend to have a component in the direction of that axis. The third product is the flux of q through the surface element; if q is flow velocity, the third product is the volumetric flow rate (volume per unit time) out of V  through dS ; or if q is heat-flow density, the third product is the heat transfer rate (energy per unit time) out of V  through dS. The fourth product, by analogy with the third, might be called the flux of the normal derivative of ψ through the surface element, but is equally well defined whether ψ is a scalar or a vector—or, for that matter, a matrix, or a tensor of any order, or anything else that we can differentiate w.r.t. n.

If we add up each of the four products over all the elements of the surface S, we obtain, respectively, the four surface integrals

∯S p n̂ dS ,      ∯S n̂ × q dS ,      ∯S q ⸱ n̂ dS ,      ∯S ∂n ψ dS ,         (2)

in which the double integral sign indicates that the range of integration is two-dimensional. The first surface integral takes a scalar field and yields a vector; the second takes a vector field and yields a vector; the third takes a vector field and yields a scalar; and the fourth takes (e.g.) a scalar field yielding a scalar, or a vector field yielding a vector. If p is pressure in a non-viscous fluid, the first integral is the force exerted by the fluid in V  on the fluid outside V. The second integral may be called the skew surface integral of q over S ,[3] or, for the reason hinted above, the circulation of q over S.  The third integral, commonly called the flux integral (or simply the surface integral) of q over S, is the total flux of q out of V. And the fourth integral is the surface integral of the outward normal derivative of ψ.

Let the volume V  be divided into elements. Let a general volume element have the volume dV and be enclosed by the surface δS —not to be confused with the area dS of a surface element, which may be an element of S or of δS. Now consider what happens if, instead of evaluating each of the above surface integrals over S, we evaluate it over each δS and add up the results for all the volume elements. In the interior of V, each surface element of area dS is on the boundary between two volume elements, for which the unit normals n̂ at dS, and the respective values of n ψ, are equal and opposite. Hence when we add up the integrals over the surfaces δS, the contributions from the elements dS cancel in pairs, except on the original surface S, so that we are left with the original integral over S. Thus, for the four surface integrals in (2), we have respectively

∯S p n̂ dS  =  Σ ∯δS p n̂ dS ,      ∯S n̂ × q dS  =  Σ ∯δS n̂ × q dS ,
∯S q ⸱ n̂ dS  =  Σ ∯δS q ⸱ n̂ dS ,      ∯S ∂n ψ dS  =  Σ ∯δS ∂n ψ dS ,         (3)

where each Σ denotes a sum over all the volume elements of V.
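Although the cancellation argument is coordinate-free, it is easy to watch in Cartesian coordinates. In the Python sketch below (the field, the boxes, and the midpoint quadrature are all illustrative assumptions, not part of the text), the flux integral over two boxes sharing a face is compared with the flux integral over their union:

```python
# Hedged illustration: flux integrals are additive over adjacent volumes,
# because contributions from the shared face cancel in pairs.

def q(p):
    x, y, z = p
    return (x*y, y*z, z*x)          # arbitrary smooth vector field

def box_flux(field, lo, hi, n=8):
    """Midpoint-rule flux of `field` out of the axis-aligned box [lo, hi]
    (exact here, since each face integrand is bilinear)."""
    total = 0.0
    for axis in range(3):
        u_ax, v_ax = [a for a in range(3) if a != axis]
        du = (hi[u_ax] - lo[u_ax]) / n
        dv = (hi[v_ax] - lo[v_ax]) / n
        for sign, c in ((-1, lo[axis]), (1, hi[axis])):
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = c
                    p[u_ax] = lo[u_ax] + (i + 0.5) * du
                    p[v_ax] = lo[v_ax] + (j + 0.5) * dv
                    total += sign * field(p)[axis] * du * dv
    return total

left  = box_flux(q, [0, 0, 0], [1, 1, 1])   # unit cube
right = box_flux(q, [1, 0, 0], [2, 1, 1])   # its neighbour across x = 1
union = box_flux(q, [0, 0, 0], [2, 1, 1])   # the two combined

print(left + right - union)                 # shared-face terms cancel
```

The shared face at x = 1 is sampled with opposite outward normals by the two smaller boxes, so its contributions cancel and the two flux integrals add up to the flux integral over the union.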

Now comes a big "if":  if  we define the gradient of p (pronounced "grad p") as

∇p  =  (1/dV) ∯δS p n̂ dS         (4g)

and the curl of q as

curl q  =  (1/dV) ∯δS n̂ × q dS         (4c)

and the divergence of q as

div q  =  (1/dV) ∯δS q ⸱ n̂ dS         (4d)

and the Laplacian of ψ as [f]

∇²ψ  =  (1/dV) ∯δS ∂n ψ dS         (4L)

(where the letters after the equation number stand for gradient, curl, divergence, and Laplacian, respectively), then equations (3) can be rewritten
∯S p n̂ dS  =  Σ ∇p dV ,      ∯S n̂ × q dS  =  Σ curl q dV ,
∯S q ⸱ n̂ dS  =  Σ div q dV ,      ∯S ∂n ψ dS  =  Σ ∇²ψ dV .
But because each term in each sum has a factor dV, we call the sum an integral; and because the range of integration is three-dimensional, we use a triple integral sign. Thus we obtain the following four theorems relating integrals over an enclosing surface S  to integrals over the enclosed volume V :

∯S p n̂ dS  =  ∭V ∇p dV         (5g)

∯S n̂ × q dS  =  ∭V curl q dV         (5c)

∯S q ⸱ n̂ dS  =  ∭V div q dV         (5d)

∯S ∂n ψ dS  =  ∭V ∇²ψ dV         (5L)

Of the above four results, only the third (5d) seems to have a standard name; it is called the divergence theorem (or Gauss's theorem or, more properly, Ostrogradsky's theorem[4]), and is indeed the best known of the four—although the other three, having been derived in parallel with it, may be said to be equally fundamental.
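As a numerical illustration of the divergence theorem (5d)—necessarily in coordinates, as the Introduction concedes for computation—the following Python sketch compares the two sides of (5d) for an arbitrary polynomial field on the unit cube (the field and the midpoint quadrature are illustrative choices):

```python
# Hedged check of the divergence theorem (5d) on the unit cube,
# with the arbitrary field q = (x², y², z²), so that div q = 2(x + y + z).

def q(p):
    x, y, z = p
    return (x*x, y*y, z*z)

def div_q(p):
    x, y, z = p
    return 2 * (x + y + z)         # analytic divergence of q

n = 6
h = 1.0 / n

# Right side of (5d): volume integral of div q (midpoint rule per cell).
volume_side = sum(div_q(((i+0.5)*h, (j+0.5)*h, (k+0.5)*h)) * h**3
                  for i in range(n) for j in range(n) for k in range(n))

# Left side of (5d): flux of q out of the cube, face by face.
surface_side = 0.0
for axis in range(3):
    for sign, c in ((-1, 0.0), (1, 1.0)):
        for i in range(n):
            for j in range(n):
                p = [(i+0.5)*h, (j+0.5)*h]
                p.insert(axis, c)          # place the face coordinate
                surface_side += sign * q(p)[axis] * h * h

print(surface_side, volume_side)           # both equal 3 for this field
```

For this field both quadratures are exact, and the two sides agree to rounding error.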

As each of the operators ∇, curl, and div calls for an integration w.r.t. area and then a division by volume, the dimension (or unit of measurement) of the result is the dimension of the operand divided by the dimension of length, as if the operation were some sort of differentiation w.r.t. position. Moreover, in each of equations (5g) to (5d), there is a triple integral on the right but only a double integral on the left, so that each of the operators ∇, curl, and div appears to compensate for a single integration. For these reasons, and for convenience, we shall describe them as differential operators. By comparison, the ∇² operator in (4L) or (5L) calls for a further differentiation w.r.t. n ; we shall therefore describe ∇² as a 2nd-order differential operator. (Another reason for these descriptions will emerge in due course.) As promised, the four definitions (4g) to (4L) are "obvious variations on the same theme" (although the fourth is somewhat less obvious than the others).

But remember the "if": Theorems (5g) to (5L) depend on definitions (4g) to (4L) and are therefore only as definite as those definitions! Equations (3), without assuming anything about the shapes and sizes of the closed surfaces δS (except, tacitly, that n̂ is piecewise well-defined), indicate that the surface integrals are additive with respect to volume. But this additivity, by itself, does not guarantee that the surface integrals are shared among neighboring volume elements in proportion to their volumes, as envisaged by "definitions" (4g) to (4L). Each of these "definitions" is unambiguous if, and only if, the ratio of the surface integral to dV is insensitive to the shape and size of δS for a sufficiently small δS. Notice that the issue here is not whether the ratios specified in equations (4g) to (4L) are true vectors or scalars, independent of the coordinates; all of the operations needed in those equations have coordinate-free definitions. Rather, the issue is whether the resulting ratios are unambiguous notwithstanding the ambiguity of δS, provided only that δS is sufficiently small. That is the advertised "caveat", which must now be addressed.

In accordance with our "applied" mathematical purpose, our proofs of the unambiguity of the differential operators will rest on a few thought experiments, each of which applies an operator to a physical field, say f, and obtains another physical field whose unambiguity is beyond dispute. The conclusion of the thought experiment is then applicable to any operand field whose mathematical properties are consistent with its interpretation as the physical field f ; the loss of generality, if any, is only what is incurred by that interpretation.

Unambiguity of the gradient


Suppose that a fluid with density ρ (a scalar field) flows with velocity v (a vector field) under the sole influence of the internal pressure p (a scalar field). Then the integral in (4g) is the force exerted by the fluid inside δS on the fluid outside, so that minus the integral is the force exerted on the fluid inside δS. Dividing by dV, we find that −∇p, as defined by (4g), is the force per unit volume,[5] which is the acceleration times the mass per unit volume; that is,

ρ dv/dt  =  −∇p .         (6g)

Now provided that the left side of this equation is locally continuous, it can be considered uniform inside the small δS, so that the left side is unambiguous, whence ∇p is also unambiguous. If there are additional forces on the fluid element, e.g. due to gravity and/or viscosity, then −∇p is not the sole contribution to density-times-acceleration, but is still the contribution due to pressure, which is still unambiguous.

By showing the unambiguity of definition (4g), we have confirmed theorem (5g). In the process we have seen that the volume-based definition of the gradient is useful for the modeling of fluids, and intuitive in that it formalizes the common notion that a pressure "gradient" gives rise to a force.
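The shape-insensitivity just argued for definition (4g) can also be probed numerically. The sketch below (field, boxes, and quadrature are illustrative assumptions) evaluates (1/dV) ∯δS p n̂ dS over two differently shaped small boxes centred on the same point and compares the results with the analytic gradient:

```python
# Hedged probe of the shape-insensitivity of definition (4g).
# The field p = 1 + 2x + 3yz is an illustrative choice (its face
# integrands are bilinear, so the midpoint rule is exact).

def p_field(x, y, z):
    return 1 + 2*x + 3*y*z         # analytic gradient: (2, 3z, 3y)

def grad_from_surface(center, half, n=8):
    """(1/dV) ∯ p n̂ dS over an axis-aligned box with the given
    half-widths, by the midpoint rule on each face."""
    V = 8 * half[0] * half[1] * half[2]
    g = [0.0, 0.0, 0.0]
    for axis in range(3):
        u_ax, v_ax = [a for a in range(3) if a != axis]
        du = 2 * half[u_ax] / n
        dv = 2 * half[v_ax] / n
        for sign in (-1, 1):
            for i in range(n):
                for j in range(n):
                    pt = [0.0, 0.0, 0.0]
                    pt[axis] = center[axis] + sign * half[axis]
                    pt[u_ax] = center[u_ax] - half[u_ax] + (i + 0.5) * du
                    pt[v_ax] = center[v_ax] - half[v_ax] + (j + 0.5) * dv
                    g[axis] += sign * p_field(*pt) * du * dv / V
    return g

c = (0.5, 0.2, 0.8)
g1 = grad_from_surface(c, (0.01, 0.01, 0.01))    # small cube
g2 = grad_from_surface(c, (0.03, 0.005, 0.02))   # flatter box, same centre
exact = (2.0, 3*c[2], 3*c[1])                    # (2, 2.4, 0.6)

print(g1, g2)
```

Both boxes return the same vector, matching the analytic gradient: for a sufficiently small δS the ratio is insensitive to its shape.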

Unambiguity of the divergence


In the aforesaid fluid, in a short time dt, the volume that flows out of the fixed closed surface δS  through a fixed surface element of area dS  is v dt ⸱ n̂ dS.  Multiplying by density and integrating over δS, we find that the mass flowing out of δS  in time dt is  dt ∯δS ρv ⸱ n̂ dS.  Dividing this by dV, and then by dt, we get the rate of reduction of density inside δS ; that is,

(1/dV) ∯δS ρv ⸱ n̂ dS  =  − ∂ρ/∂t ,
where the derivative w.r.t. time is evaluated at a fixed location (because δS is fixed), and is therefore written as a partial derivative (because other variables on which ρ might depend—namely the coordinates—are held constant). Provided that the right-hand side is locally continuous, it can be considered uniform inside δS and is therefore unambiguous, so that the left side is likewise unambiguous. But the left side is simply div ρv  as defined by (4d),[g] which is therefore also unambiguous,[6] confirming theorem (5d). In short, the divergence operator is that which maps ρv to the rate of reduction of density at a fixed point:

div ρv  =  − ∂ρ/∂t .         (7d)

This result, which expresses conservation of mass, is a form of the so-called equation of continuity.

The partial derivative ∂ρ/∂t in (7d) must be distinguished from the so-called material derivative dρ/dt, which is evaluated at a point that moves with the fluid.[h] [Similarly, dv/dt in (6g) is the material acceleration, because it is the acceleration of the mobile mass—not of a fixed point! ]  To re-derive the equation of continuity in terms of the material derivative, the volume v dt ⸱ n̂ dS, which flows out through dS in time dt (as above), is integrated over δS to obtain the increase in volume of the mass initially contained in dV. Dividing this by the mass, ρ dV, gives the increase in specific volume (1/ρ) of that mass, and then dividing by dt gives the rate of change of specific volume; that is,

(1/(ρ dV)) ∯δS v ⸱ n̂ dS  =  d(1/ρ)/dt .
Multiplying by ρ² and comparing the left side with (4d), we obtain

ρ div v  =  − dρ/dt .         (7d')

Whereas (7d) shows that div ρv is unambiguous, (7d') shows that div v is unambiguous (provided that the right-hand sides are locally continuous). In accordance with the everyday meaning of "divergence", (7d') also shows that div v is positive if the fluid is expanding (ρ decreasing), negative if it is contracting (ρ increasing), and zero if it is incompressible. In the last case, the equation of continuity reduces to

div v  =  0    [for an incompressible fluid].         (7i)

For incompressible flow, any tubular surface tangential to the flow velocity, and consequently with no flow in or out of the "tube", has the same volumetric flow rate across all cross-sections of the "tube", as if the surface were the wall of a pipe full of liquid (except that the surface is not necessarily stationary). Accordingly, a vector field with zero divergence is described as solenoidal (from the Greek word for "pipe"). More generally, a solenoidal vector field has the property that for any tubular surface tangential to the field, the flux integrals across any two cross-sections of the "tube" are the same—because otherwise there would be a net flux integral out of the closed surface comprising the two cross-sections and the segment of tube between them, in which case, by the divergence theorem (5d), the divergence would have to be non-zero somewhere inside, contrary to (7i).
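For a concrete (coordinate-based, purely illustrative) example, the field q = (yz, zx, xy) has zero divergence everywhere, so its flux integral out of any closed surface should vanish; the sketch below checks this on an arbitrary box:

```python
# Hedged illustration: a solenoidal field has zero net flux integral
# out of any closed surface. Box and quadrature are illustrative; the
# midpoint rule is exact here because each face integrand is bilinear.

def q(p):
    x, y, z = p
    return (y*z, z*x, x*y)          # div q = 0 everywhere

def box_flux(field, lo, hi, n=8):
    total = 0.0
    for axis in range(3):
        u_ax, v_ax = [a for a in range(3) if a != axis]
        du = (hi[u_ax] - lo[u_ax]) / n
        dv = (hi[v_ax] - lo[v_ax]) / n
        for sign, c in ((-1, lo[axis]), (1, hi[axis])):
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = c
                    p[u_ax] = lo[u_ax] + (i + 0.5) * du
                    p[v_ax] = lo[v_ax] + (j + 0.5) * dv
                    total += sign * field(p)[axis] * du * dv
    return total

net = box_flux(q, [0.2, -0.5, 0.1], [1.7, 0.8, 1.4])
print(net)                          # vanishes, to rounding error
```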

Unambiguity of the curl (and gradient)


The unambiguity of the curl (4c) follows from the unambiguity of the divergence. Taking dot-products of (4c) with an arbitrary constant vector b, we get
b ⸱ curl q  =  (1/dV) ∯δS b ⸱ (n̂ × q) dS  =  (1/dV) ∯δS (q × b) ⸱ n̂ dS ;
that is, by (4d),

b ⸱ curl q  =  div (q × b)    [for uniform b].         (8c)

(The parentheses on the right, although helpful because of the spacing, are not strictly necessary, because the alternative binding would be (div q), which is a scalar, whose cross-product with the vector b is not defined. And the left-hand expression does not need parentheses, because it can only mean the dot-product of a curl with the vector b; it cannot mean the curl of a dot-product, because the curl of a scalar field is not defined.) This result (8c) is an identity if the vector b is independent of location, so that it can be taken inside or outside the surface integral; thus b may be a uniform vector field, and may be time-dependent. If we make b a unit vector, the left side of the identity is the (scalar) component of curl q in the direction of b, and the right side is unambiguous. Thus the curl is unambiguous because its component in any direction is unambiguous. This confirms theorem (5c).
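Identity (8c) can be spot-checked in Cartesian coordinates with central differences (the field, the uniform vector b, and the test point are arbitrary illustrative choices):

```python
# Hedged spot-check of (8c): b ⸱ curl q = div (q × b) for uniform b.

h = 1e-5

def q(x, y, z):
    return (y*z, x*x, x*y*z)        # arbitrary smooth vector field

b = (0.3, -0.7, 0.2)                # uniform (constant) vector

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def jacobian(F, x, y, z):
    """J[i][j] = ∂i F_j by central differences."""
    J = []
    for axis in range(3):
        d = [h if a == axis else 0.0 for a in range(3)]
        plus  = F(x + d[0], y + d[1], z + d[2])
        minus = F(x - d[0], y - d[1], z - d[2])
        J.append([(a - c) / (2*h) for a, c in zip(plus, minus)])
    return J

def curl(F, x, y, z):
    J = jacobian(F, x, y, z)
    return (J[1][2] - J[2][1], J[2][0] - J[0][2], J[0][1] - J[1][0])

def div(F, x, y, z):
    J = jacobian(F, x, y, z)
    return J[0][0] + J[1][1] + J[2][2]

def qxb(x, y, z):
    return cross(q(x, y, z), b)

pt = (1.1, 0.4, -0.6)
lhs = sum(bi*ci for bi, ci in zip(b, curl(q, *pt)))   # b ⸱ curl q
rhs = div(qxb, *pt)                                   # div (q × b)
print(lhs, rhs)
```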

Similarly, the unambiguity of the divergence implies the unambiguity of the gradient. Starting with (4g), taking dot-products with an arbitrary uniform vector b, and proceeding as above, we obtain

b ⸱ ∇p  =  div (p b)    [for uniform b].         (8g)

(The left-hand side does not need parentheses, because it can only mean the dot-product of a gradient with the vector b; it cannot mean the gradient of the dot-product of a scalar field with a vector field, because that dot-product would not be defined.) If we make b a unit vector, this result (8g) says that the (scalar) component of ∇p in the direction of b is given by the right-hand side, which again is unambiguous. Thus we have a second explanation of the unambiguity of the gradient: like the curl, it is unambiguous because its component in any direction is unambiguous.

We might well ask what happens if we take cross-products with b on the left, instead of dot-products. If we start with (4g), the process is straightforward: in the end we can switch the order of the cross-product on the left, and change the sign on the right, obtaining

∇p × b  =  curl (p b)    [for uniform b].         (8p)

(Again no parentheses are needed.) If we start with (4c) instead, and take b inside the integral, we get a vector triple product to expand, which leads to

b × curl q  =  (1/dV) ∯δS b ⸱ q n̂ dS  −  (1/dV) ∯δS b ⸱ n̂ q dS .         (8q)

Here the first term on the right is simply  ∇ b⸱q  (the gradient of the dot-product). The second term is more problematic. If we had a scalar p instead of the vector q, we could take b outside the second integral, so that the second term would be (minus) b ⸱ ∇p. This suggests that the actual second term should be (minus) b ⸱ ∇q. But we do not yet know how to interpret that expression for a vector field q; and if we were to adopt the second term above (without the minus sign) as the definition of b⸱∇ q (treating b⸱∇ as an operator), that would be open to the objection that b⸱∇ q had been defined only for uniform b, whereas b ⸱ ∇p (for scalar p) is defined whether b is uniform or not. So, for the moment, let us put (8q) aside and run with (8c), (8g), and (8p).
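Of these, (8p) makes a convenient numerical spot-check (the scalar field, the uniform vector b, and the point below are illustrative choices, with all derivatives taken as central finite differences):

```python
# Hedged spot-check of (8p): ∇p × b = curl (p b) for uniform b.

h = 1e-5

def p(x, y, z):
    return x*x*y + y*z + z**3       # arbitrary smooth scalar field

b = (0.4, 1.1, -0.6)                # uniform (constant) vector

def grad(f, x, y, z):
    return ((f(x+h, y, z) - f(x-h, y, z)) / (2*h),
            (f(x, y+h, z) - f(x, y-h, z)) / (2*h),
            (f(x, y, z+h) - f(x, y, z-h)) / (2*h))

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def curl(F, x, y, z):
    J = []                          # J[i][j] = ∂i F_j
    for axis in range(3):
        d = [h if a == axis else 0.0 for a in range(3)]
        plus  = F(x + d[0], y + d[1], z + d[2])
        minus = F(x - d[0], y - d[1], z - d[2])
        J.append([(a - c) / (2*h) for a, c in zip(plus, minus)])
    return (J[1][2] - J[2][1], J[2][0] - J[0][2], J[0][1] - J[1][0])

def pb(x, y, z):                    # the vector field p b
    s = p(x, y, z)
    return (s*b[0], s*b[1], s*b[2])

pt = (0.7, -0.3, 1.2)
lhs = cross(grad(p, *pt), b)        # ∇p × b
rhs = curl(pb, *pt)                 # curl (p b)
print(lhs, rhs)
```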

Another meaning of the gradient


Let ŝ be a unit vector in a given direction, and let s be a parameter measuring distance (arc length) along a path in that direction. By equation (8g) and definition (4d), we have
ŝ ⸱ ∇p  =  div (p ŝ)  =  (1/dV) ∯δS p ŝ ⸱ n̂ dS ,
where, by the unambiguity of the divergence, the shape of the closed surface δS enclosing dV can be chosen for convenience. So let δS be a right cylinder with cross-sectional area dS  and perpendicular height ds , with the path passing perpendicularly through the end-faces; and let the cross-sectional dimensions be small compared with ds  so that the values of p at the end-faces, say p and p+dp, can be taken to be the same as where the end-faces cut the path, at (say) parameter-values s and s+ds  respectively, where the outward unit normal n̂ takes the values −ŝ and ŝ  respectively (because the end-faces are perpendicular to the path). Then  dV = dS ds , and the surface integral over δS includes only the contributions from the end-faces (because n̂ is perpendicular to ŝ elsewhere); those contributions are respectively  p ŝ⸱(−ŝ) dS  and  (p+dp) ŝ⸱ŝ dS ,  i.e.  −p dS  and  (p+dp) dS.  With these substitutions the above equation becomes

ŝ ⸱ ∇p  =  [ −p dS + (p+dp) dS ] / (dS ds) ;
that is,

ŝ ⸱ ∇p  =  ∂p/∂s ,         (9g)

where the right-hand side, commonly called the directional derivative of p in the ŝ direction,[7] is the derivative of p w.r.t. distance in that direction. Although (9g) has been obtained by taking that direction as fixed, the equality is evidently maintained if s measures arc length along any path tangential to ŝ at the point of interest.

Equation (9g) is so labeled because it is an alternative definition of the gradient: it says that ∇p is the vector whose component in any direction is the directional derivative of p in that direction. For real p, this component has its maximum, namely |∇p|, in the direction of ∇p; thus ∇p is the vector whose direction is that in which the derivative of p w.r.t. distance is a maximum, and whose magnitude is that maximum. This is the usual conceptual definition of the gradient.[i]

If  ŝ is tangential to a level surface of p (a surface of constant p), then ∂p/∂s in that direction is zero, in which case (9g) says that ∇p (if not zero) is orthogonal to ŝ.  So ∇p is orthogonal to the surfaces of constant p (as we would expect, having just shown that the direction of ∇p is that in which p varies most steeply).

If p is uniform—that is, if it has no spatial variation—then its derivative w.r.t. distance in every direction is zero; that is, the component of ∇p in every direction is zero, so that ∇p must be the zero vector. In short, the gradient of a uniform scalar field is zero.
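Equation (9g) is also easy to illustrate numerically: the sketch below (arbitrary field, direction, and point, with central-difference derivatives) compares ŝ ⸱ ∇p with the derivative of p w.r.t. arc length along ŝ:

```python
# Hedged check of (9g): the component of ∇p in the direction ŝ equals
# the derivative of p w.r.t. distance in that direction.

import math

h = 1e-5

def p(x, y, z):
    return x*x*y + math.sin(z)      # arbitrary smooth scalar field

def grad_p(x, y, z):
    return ((p(x+h, y, z) - p(x-h, y, z)) / (2*h),
            (p(x, y+h, z) - p(x, y-h, z)) / (2*h),
            (p(x, y, z+h) - p(x, y, z-h)) / (2*h))

s = (2.0, -1.0, 2.0)
mag = math.sqrt(sum(c*c for c in s))
s_hat = tuple(c / mag for c in s)          # unit vector ŝ

pt = (0.4, 1.2, 0.9)
lhs = sum(a*b for a, b in zip(s_hat, grad_p(*pt)))        # ŝ ⸱ ∇p

ahead  = tuple(c + h*d for c, d in zip(pt, s_hat))
behind = tuple(c - h*d for c, d in zip(pt, s_hat))
rhs = (p(*ahead) - p(*behind)) / (2*h)                    # ∂p/∂s along ŝ

print(lhs, rhs)
```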

Unambiguity of the Laplacian


Armed with our new definition of the gradient (9g), we can revisit our definition of the Laplacian (4L). If ψ is a scalar field, then, by (9g), ∂n ψ can be replaced by ∇ψ ⸱ n̂ in (4L), which then becomes

∇²ψ  =  (1/dV) ∯δS ∇ψ ⸱ n̂ dS ;         (9L)

that is, by definition (4d),

∇²ψ  =  div ∇ψ    [for scalar ψ].

So the Laplacian of a scalar field is the divergence of the gradient. The unambiguity of the Laplacian then follows from the unambiguity of the divergence and the gradient.
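For a scalar field, div ∇ψ corresponds in Cartesian coordinates to the familiar sum of unmixed second derivatives; the sketch below (illustrative ψ and point) compares the standard 7-point second-difference stencil—the discrete div grad—with the analytic value:

```python
# Hedged check that the Laplacian of a scalar field (discrete div grad,
# i.e. the standard 7-point stencil) matches the analytic value.

h = 1e-3

def psi(x, y, z):
    return x*x*y + z**3             # analytic Laplacian: 2y + 6z

def laplacian(f, x, y, z):
    return ((f(x+h, y, z) - 2*f(x, y, z) + f(x-h, y, z))
          + (f(x, y+h, z) - 2*f(x, y, z) + f(x, y-h, z))
          + (f(x, y, z+h) - 2*f(x, y, z) + f(x, y, z-h))) / (h*h)

x0, y0, z0 = 0.3, -0.8, 1.1
num   = laplacian(psi, x0, y0, z0)
exact = 2*y0 + 6*z0
print(num, exact)
```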

If, on the contrary, ψ in definition (4L) is a vector field, then we can again take dot-products with a uniform vector b, obtaining
b ⸱ ∇²ψ  =  ∇² (b ⸱ ψ) .
If we make b a unit vector, this says that the scalar component of the Laplacian of a vector field, in any direction, is the Laplacian of the scalar component of that vector field in that direction. As we have just established that the latter is unambiguous, so is the former.

But the unambiguity of the Laplacian can be generalized further. If
ψ  =  Σi αi ψi ,
where each ψi is a scalar field, and each αi is a constant, and the counter i ranges from (say) 1 to k , then it is clear from (4L) that

∇² Σi αi ψi  =  Σi αi ∇²ψi .         (10)

In words, this says that the Laplacian of a linear combination of fields is the same linear combination of the Laplacians of the same fields—or, more concisely, that the Laplacian is linear. I say "it is clear" because the Laplacian as defined by (4L) is itself a linear combination, so that (10) merely asserts that we can regroup the terms of a nested linear combination; the gradient, curl, and divergence as defined by (4g) to (4d) are likewise linear. It follows from (10) that the Laplacian of a linear combination of fields is unambiguous if the Laplacians of the separate fields are unambiguous. Now we have supposed that the fields ψi are scalar and that the coefficients αi are constants. But the same logic applies if the "constants" are uniform basis vectors (e.g., i, j, k), so that the "linear combination" can represent any vector field, whence the Laplacian of any vector field is unambiguous. And the same logic applies if the "constants" are chosen as a "basis" for a space of tensors of any order, so that the Laplacian of any tensor field of that order is unambiguous, and so on. In short, the Laplacian of anything that we can express with a uniform basis is unambiguous.

The dot-del, del-cross, and del-dot operators


The gradient operator ∇ is also called del.[j] If it simply denotes the gradient, we tend to pronounce it "grad" in order to emphasize the result. But it can also appear in combination with other operators to give other results, and in those contexts we tend to pronounce it "del".

One such combination is "dot del"— as in "b⸱∇ ", which we proposed after (8q), but did not manage to define satisfactorily for a vector operand. With our new definition of the gradient (9g), we can now make a second attempt. A general vector field q can be written |q| q̂ , so that
q ⸱ ∇ψ  =  |q| q̂ ⸱ ∇ψ .
If ψ is a scalar field, we can apply (9g) to the right-hand side, obtaining
q ⸱ ∇ψ  =  |q| ∂ψ/∂sq ,
where sq is distance in the direction of q. For scalar ψ, this result is an identity between previously defined quantities. For non-scalar ψ, we have not yet defined the left-hand side, but the right-hand side is still well-defined and self-explanatory (provided that we can differentiate ψ w.r.t. sq). So we are free to adopt

q ⸱ ∇ψ  =  |q| ∂ψ/∂sq ,         (11)

where sq is distance in the direction of q , as the general definition of the operator q⸱∇ , and to interpret it as defining both a unary operator  q⸱∇ , which operates on a generic field, and a binary operator  ⸱∇ , which takes a (possibly uniform) vector field on the left and a generic field on the right.
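Definition (11) can be sanity-checked in the scalar case, where both sides are already defined (the vector field q, the operand ψ, and the point below are illustrative choices):

```python
# Hedged check of definition (11) for a scalar operand:
# (q ⸱ ∇)ψ = |q| ∂ψ/∂s_q, with central-difference derivatives.

import math

h = 1e-5

def psi(x, y, z):
    return x*y + z*z

def q(x, y, z):
    return (1.0 + x, 2.0*y, -z)     # a non-uniform vector field

pt = (0.6, 0.9, -0.4)
qv = q(*pt)
mag = math.sqrt(sum(c*c for c in qv))
q_hat = tuple(c / mag for c in qv)

# Left side: sum of q_i ∂ψ/∂x_i (Cartesian components).
lhs = 0.0
for axis in range(3):
    d = [h if a == axis else 0.0 for a in range(3)]
    lhs += qv[axis] * (psi(pt[0]+d[0], pt[1]+d[1], pt[2]+d[2])
                     - psi(pt[0]-d[0], pt[1]-d[1], pt[2]-d[2])) / (2*h)

# Right side: |q| times the derivative of ψ w.r.t. distance s_q
# in the direction of q at the point of interest.
ahead  = tuple(c + h*d for c, d in zip(pt, q_hat))
behind = tuple(c - h*d for c, d in zip(pt, q_hat))
rhs = mag * (psi(*ahead) - psi(*behind)) / (2*h)

print(lhs, rhs)
```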

For the special case in which q is a unit vector ŝ , with s measuring distance in the direction of  ŝ , definition (11) reduces to

ŝ ⸱ ∇ψ  =  ∂ψ/∂s ,         (12)

which agrees with (9g) but now holds for a generic field ψ [whereas (9g) was for a scalar field, and was derived as a theorem based on earlier definitions]. So ŝ⸱∇ , with a unit vector ŝ , is the directional-derivative operator on a generic field.

In particular, if  ŝ = n̂  we have
n̂ ⸱ ∇ψ  =  ∂ψ/∂n  =  ∂n ψ ,
which we may substitute into the original definition of the Laplacian (4L) to obtain

∇²ψ  =  (1/dV) ∯δS n̂ ⸱ ∇ψ dS ,         (13L)

which is just (9L) again, except that it now holds for a generic field.

If our general definition of the gradient (4g) is also taken as the general definition of the ∇ operator,[8] then, comparing (4g) with (4c), (4d), and (13L), we see that

curl q  =  (∇×) q ,      div q  =  (∇⸱) q ,      ∇²ψ  =  (∇⸱∇) ψ ,
where the parentheses may seem to be required on account of the closing dS in (4g). But if we write the factor dS before the integrand, the del operator in (4g) becomes
∇  =  (1/dV) ∯δS dS n̂
if  we insist that it is to be read as an operator looking for an operand, and not as a self-contained expression. Then, if we similarly bring forward the dS in (4c), (4d), and (13L), the respective operators become

 

 

 

 

(14)

(pronounced "del cross", "del dot", and "del dot del"), of which the last is usually abbreviated as ∇² ("del squared").[k] Because these operational equivalences follow from coordinate-free definitions, they must remain valid when correctly expressed in any coordinate system.[l] That does not mean that they are always convenient or always conducive to the avoidance of error—of which we shall have more to say in due course. But they sometimes make useful mnemonic devices. For example, for uniform b, they let us rewrite identities (8c), (8g), and (8p) as

(15)

These would be basic algebraic vector identities if ∇ were an ordinary vector, and one could try to derive them from the "algebraic" behavior of ∇; but they're not, because it isn't, so we didn't!  Moreover, these simple "algebraic" rules are for a uniform b, and do not of themselves tell us what to do if b is spatially variable; for example, (8g) is not applicable to (7d).

The advection operator

[edit | edit source]

Variation or transportation of a property of a medium due to motion with the medium is called advection (which, according to its Latin roots, means "carrying to"). Suppose that a medium (possibly a fluid) moves with a velocity field v in some inertial reference frame. Let ψ be a field (possibly a scalar field or a vector field) expressing some property of the medium (e.g., density, or acceleration, or stress,[m]… or even v itself). We have seen that the time-derivative of ψ may be specified in two different ways: as the partial derivative ∂ψ/∂t , evaluated at a fixed point (in the chosen reference frame), or as the material derivative dψ/dt , evaluated at a point moving at velocity v (i.e., with the medium). The difference  dψ/dt − ∂ψ/∂t  is due to motion with the medium. To find another expression for this difference, let s be a parameter measuring distance along the path traveled by a particle of the medium. Then, for a short time interval dt, the surface representing the small change in ψ as a function of the small changes in t and s (plotted on perpendicular axes) can be taken as a plane through the origin, so that

dψ = (∂ψ/∂t) dt + (∂ψ/∂s) ds ;
that is, the change in ψ is the sum of the changes due to the change in t and the change in s . Dividing by dt gives

dψ/dt = ∂ψ/∂t + (∂ψ/∂s)(ds/dt) ;

i.e., since ds/dt = |v| ,

dψ/dt = ∂ψ/∂t + |v| ∂ψ/∂s

(and the first term on the right could have been written ∂ₜψ). So the second term on the right is the contribution to the material derivative due to motion with the medium; it is called the advective term, and is non-zero wherever a particle of the medium moves along a path on which ψ varies with location—even if ψ at each location is constant over time. If ψ is v itself, the above result becomes

dv/dt = ∂v/∂t + |v| ∂v/∂s ,
where the left-hand side (the material acceleration) is as given by Newton's second law, and the first term on the right (which we might call the "partial" acceleration) is the time-derivative of velocity in the chosen reference frame, and the second term on the right (the advective term) is the correction that must be added to the "partial" acceleration in order to obtain the material acceleration. This term is non-zero wherever speed is non-zero and varies along a path, even if the velocity at each point on the path is constant over time (as when water speeds up while flowing at a constant volumetric rate into a nozzle). Paradoxically, while the material acceleration and the "partial" acceleration are apparently linear (first-degree) in v, their difference (the advective term) is not. Thus the distinction between ∂ψ/∂t and dψ/dt has the far-reaching implication that fluid dynamics is non-linear.
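The nozzle remark can be made concrete with a tiny numerical model (a sketch only; the linear profile u(x) = 1 + x is an invented example of a steady one-dimensional flow that speeds up along x):

```python
def u(x):
    # invented steady 1-D velocity profile: no time dependence,
    # so the "partial" acceleration du/dt at a fixed point is zero
    return 1.0 + x

def material_acceleration(x, h=1e-6):
    # du/dt following a particle = (partial du/dt) + u du/dx;
    # the partial term is zero here, so only the advective term survives
    dudx = (u(x + h) - u(x - h)) / (2 * h)
    return 0.0 + u(x) * dudx

print(material_acceleration(0.5))  # about 1.5: non-zero although the field is steady
```

Although the velocity at every fixed point is constant in time, each particle accelerates as it moves into the faster region; the advective term u du/dx supplies the whole material acceleration.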

Applying (11) to the last two equations, we obtain respectively

dψ/dt = ∂ψ/∂t + (v⸱∇)ψ         (16)

and

dv/dt = ∂v/∂t + (v⸱∇)v         (16v)

where, in each case, the second term on the right is the advective term. Thus v⸱∇ is the operator which maps a property of a medium to the advective term in the time-derivative of that property; in short, v⸱∇ is the advection operator.

When the generic ψ in (16) is replaced by the density ρ , we get a relation between ∂ρ/∂t and dρ/dt , both of which we have seen before—in equations (7d) and (7d') above. Substituting from those equations then gives

∇⸱(ρv) = ρ ∇⸱v + v⸱∇ρ         (17)

where ∇ρ can be taken as a gradient, since ρ is scalar. This result is in fact an identity—a product rule for the divergence—as we shall eventually confirm by another method.
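Pending that confirmation, the product rule can at least be spot-checked numerically. In this sketch (illustrative only) the density ρ and velocity v are arbitrary smooth fields invented for the test, and all derivatives are central differences:

```python
import math

H = 1e-5  # finite-difference step

def rho(x, y, z):
    # arbitrary smooth density field
    return 1.0 + 0.3 * x * y + 0.1 * math.cos(z)

def vel(x, y, z):
    # arbitrary smooth velocity field
    return (y * z, math.sin(x), x * x - z)

def partial(f, i, p):
    # central-difference partial derivative of scalar-valued f w.r.t. coordinate i
    a = list(p); b = list(p)
    a[i] += H; b[i] -= H
    return (f(*a) - f(*b)) / (2 * H)

def component(field, i):
    # scalar-valued i-th component of a vector field
    return lambda *r: field(*r)[i]

def divergence(field, p):
    # sum of the three diagonal partial derivatives
    return sum(partial(component(field, i), i, p) for i in range(3))

def rho_v(*r):
    # the product field rho * v
    return tuple(rho(*r) * vi for vi in vel(*r))

p = (0.4, -0.2, 0.9)                               # arbitrary sample point
lhs = divergence(rho_v, p)                         # div(rho v)
grad_rho = [partial(rho, i, p) for i in range(3)]
rhs = rho(*p) * divergence(vel, p) + sum(
    vi * gi for vi, gi in zip(vel(*p), grad_rho))  # rho div v + v . grad rho
print(abs(lhs - rhs) < 1e-6)
```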

Generalized volume-integral theorem

[edit | edit source]

We can rewrite the fourth integral theorem (5L) in the "dot del" notation as

∫_V ∇²ψ dV = ∮_S n̂⸱∇ψ dS         (18L)

Then, using notations (14), we can condense the four integral theorems (5g), (5c), (5d), and (18L) into the single equation

∫_V ∇∘ψ dV = ∮_S n̂∘ψ dS         (19)

where the "circ" symbol ∘ is a generic binary operator which may be replaced by a null (direct juxtaposition of the operands) for theorem (5g), or a cross for (5c), or a dot for (5d), or ⸱∇ for (18L). This single equation is a generalized volume-integral theorem, relating an integral over a volume to an integral over its enclosing surface.[n]

Theorem (19) is based on the following definitions, which have been found unambiguous:

  • the gradient of a scalar field p is the closed-surface integral of  n̂ p per unit volume, where n̂ is the outward unit normal;
  • the divergence of a vector field is the outward flux integral per unit volume;
  • the curl of a vector field is the skew surface integral per unit volume, also called the surface circulation per unit volume; and
  • the Laplacian is the closed-surface integral of the outward normal derivative, per unit volume.
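As a sanity check on the second of these definitions (a numerical sketch only, with an invented vector field F), the outward flux through a small cube, per unit volume, should approach the divergence at the centre:

```python
import math

def F(x, y, z):
    # invented smooth vector field; its exact divergence is x + y + z
    return (x * y, y * z + math.sin(x), z * x)

def flux_per_unit_volume(x0, y0, z0, a=1e-3, n=4):
    # outward flux of F through a cube of side a centred at (x0, y0, z0),
    # each face integrated by an n-by-n midpoint rule, divided by the volume a^3
    h = a / n
    total = 0.0
    for axis in range(3):
        for sign in (+1.0, -1.0):
            for i in range(n):
                for j in range(n):
                    p = [x0, y0, z0]
                    p[axis] += sign * a / 2
                    p[(axis + 1) % 3] += -a / 2 + (i + 0.5) * h
                    p[(axis + 2) % 3] += -a / 2 + (j + 0.5) * h
                    # outward normal component of F times the cell area
                    total += sign * F(*p)[axis] * h * h
    return total / a ** 3

pt = (0.2, 0.5, -0.3)
print(abs(flux_per_unit_volume(*pt) - sum(pt)) < 1e-4)  # sum(pt) is the exact divergence
```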

The gradient maps a scalar field to a vector field; the divergence maps a vector field to a scalar field; the curl maps a vector field to a vector field; and the Laplacian maps a scalar field to a scalar field, or a vector field to a vector field, etc.

The gradient of p, as defined above, has been shown to be also

  • the vector whose component in any direction is the directional derivative of p in that direction (i.e. the derivative of p w.r.t. distance in that direction), and
  • the vector whose direction is that in which the directional derivative of p is a maximum, and whose magnitude is that maximum.

Consistent with these alternative definitions of the gradient, we have defined the ⸱∇ operator so that ŝ⸱∇ (for a unit vector ŝ) is the operator yielding the directional derivative in the direction of ŝ , and we have used that notation to bring theorem (5L) under theorem (19).

So far, we have said comparatively little about the curl. That imbalance will now be rectified.

Closed-circuit integrals per unit area

[edit | edit source]

[To be continued.]

Additional information

[edit | edit source]

Competing interests

[edit | edit source]

None.

Ethics statement

[edit | edit source]

This article does not concern research on human or animal subjects.

TO DO:

[edit | edit source]
  • Abstract
  • Keywords
  • Figure(s) & caption(s)
  • Etc.!

Notes

[edit | edit source]
  1. E.g., Feynman (1963, vol. 1, §11-5), having defined velocity from displacement in Cartesian coordinates, shows that velocity is a vector by showing that its coordinate representation contra-rotates (like that of displacement) if the coordinate system rotates.
  2. E.g., Feynman (1963, vol. 1, §11-7), having defined the magnitude and dot-product operators in Cartesian coordinates, shows that they are scalar operators by showing that their representations in rotated coordinates are the same as in the original coordinates (except for names of coordinates and components). And Chen-To Tai (1995, pp. 40–42), having determined the form of the "gradient" operator in a general curvilinear orthogonal coordinate system, shows that it is a vector operator by showing that it has the same form in any other curvilinear orthogonal coordinate system.
  3. Even if we claim that "particles" of matter are wave functions and therefore continuous, this still implies that matter is lumpy in a manner not normally contemplated by continuum mechanics.
  4. There are many proofs and interpretations of this identity. My own effort, for what it's worth, is at math.stackexchange.com/a/4839213/307861. The classic is Gibbs, 1881, §§ 26–7.
  5. If r is the position of a particle and p is its momentum, the last term vanishes. If the force is toward the origin, the previous term also vanishes.
  6. Here I use the broad triangle symbol (△) rather than the narrower Greek Delta (Δ); the latter would more likely be misinterpreted as "change in…"
  7. There is no need for parentheses around ρv , because div ρv cannot mean (div ρ)v , because the divergence of a scalar field is not defined.
  8. The material derivative d/dt is also called the substantive derivative, and is sometimes written D/Dt if the result is meant to be understood as a field rather than simply a function of time (Kemmer, 1977, pp. 184–5).
  9. Gibbs (1881, § 50) introduces the gradient with this definition, except that he calls ∇u simply the derivative of u, and u the primitive of ∇u. Use of the term gradient as an alternative to derivative is reported by Wilson (1907, p. 138).
  10. Or nabla, because it allegedly looks like the ancient Phoenician harp that the Greeks called by that name.
  11. But Gibbs (1881) and Wilson (1907) were content to leave it as ∇⸱∇.  And they did not call it the Laplacian; they used that term with a different meaning, which has apparently fallen out of fashion.
  12. The common perception that they are valid only in Cartesian coordinates arises chiefly from failure to allow for the variability of the basis vectors in other coordinate systems; cf. Kemmer, 1977, pp. 163–5, 172–3 (Exs. 2, 3, 5), 230–33 (sol'ns).
  13. Stress is a second-order tensor, and the origin of the term "tensor"; but, for present purposes, it's just another possible example of a field called ψ.
  14. Kemmer (1977, p. 98) gives an equivalent result for our first three integral theorems (5g to 5d) only, and calls it the generalized divergence theorem because the divergence theorem is its most familiar special case.

References

[edit | edit source]
  1. Axler, 1995, §9. The relegation of determinants was anticipated by C.G. Broyden (1975). But Broyden's approach is less radical: he does not deal with abstract vector spaces or abstract linear transformations, and his eventual definition of the determinant, unlike Axler's, is traditional—not a product of the preceding narrative.
  2. Axler, 1995, §1. But it is Broyden (1975), not Axler, who discusses numerical methods at length.
  3. Gibbs, 1881, § 56.
  4. Katz, 1979, pp. 146–9.
  5. In the three-volume Feynman Lectures on Physics (1963),  −∇p as the "pressure force per unit volume" eventually appears in the 3rd-last lecture of Volume 2 (§40-1).
  6. A demonstration like the foregoing is outlined by Gibbs (1881, § 55).
  7. Wilson, 1907, pp. 147–8; Borisenko & Tarapov, 1968, pp. 147–8 (again); Kreyszig, 1988, pp. 485–6; Wrede & Spiegel, 2010, p. 198.
  8. Cf. Borisenko & Tarapov, 1968, p. 157, eq. (4.43), quoted in Tai, 1995, p. 33, eq. (4.19).

Bibliography

[edit | edit source]
  • S.J. Axler, 1995, "Down with Determinants!"  American Mathematical Monthly, vol. 102, no. 2 (Feb. 1995), pp. 139–54; jstor.org/stable/2975348.  (Author's preprint, with different pagination: researchgate.net/publication/265273063_Down_with_Determinants.)
  • S.J. Axler, 2023–, Linear Algebra Done Right, 4th Ed., Springer; linear.axler.net (open access).
  • A.I. Borisenko and I.E. Tarapov (tr. & ed. R.A. Silverman), 1968, Vector and Tensor Analysis with Applications, Prentice-Hall; reprinted New York: Dover, 1979, archive.org/details/vectortensoranal0000bori.
  • C.G. Broyden, 1975, Basic Matrices, London: Macmillan.
  • R.P. Feynman, R.B. Leighton, & M. Sands, 1963 etc., The Feynman Lectures on Physics, California Institute of Technology; feynmanlectures.caltech.edu.
  • J.W. Gibbs, 1881–84, "Elements of Vector Analysis", privately printed New Haven: Tuttle, Morehouse & Taylor, 1881 (§§ 1–101), 1884 (§§ 102–189, etc.), archive.org/details/elementsvectora00gibb; published in The Scientific Papers of J. Willard Gibbs (ed. H.A. Bumstead & R.G. Van Name), New York: Longmans, Green, & Co., 1906, vol. 2, archive.org/details/scientificpapers02gibbuoft, pp. 17–90.
  • V.J. Katz, 1979, "The history of Stokes' theorem", Mathematics Magazine, vol. 52, no. 3 (May 1979), pp. 146–56; jstor.org/stable/2690275.
  • N. Kemmer, 1977, Vector Analysis: A physicist's guide to the mathematics of fields in three dimensions, Cambridge; archive.org/details/isbn_0521211581.
  • E. Kreyszig, 1962 etc., Advanced Engineering Mathematics, New York: Wiley;  5th Ed., 1983;  6th Ed., 1988;  9th Ed., 2006;  10th Ed., 2011.
  • P.H. Moon and D.E. Spencer, 1965, Vectors, Princeton, NJ: Van Nostrand.
  • W.K.H. Panofsky and M. Phillips, 1962, Classical Electricity and Magnetism, 2nd Ed., Addison-Wesley; reprinted Mineola, NY: Dover, 2005.
  • C.-T. Tai, 1990, "Differential operators in vector analysis and the Laplacian of a vector in the curvilinear orthogonal system" (Technical Report RL 859), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/21026.
  • C.-T. Tai, 1994, "A survey of the improper use of ∇ in vector analysis" (Technical Report RL 909), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7869.
  • C.-T. Tai, 1995, "A historical study of vector analysis" (Technical Report RL 915), Dept. of Electrical Engineering & Computer Science, University of Michigan; hdl.handle.net/2027.42/7868.
  • E.B. Wilson, 1907, Vector Analysis: A text-book for the use of students of mathematics and physics ("Founded upon the lectures of J. Willard Gibbs…"), 2nd Ed., New York: Charles Scribner's Sons; archive.org/details/vectoranalysisa01wilsgoog.
  • R.C. Wrede and M.R. Spiegel, 2010, Advanced Calculus, 3rd Ed., New York: McGraw-Hill (Schaum's Outlines); archive.org/details/schaumsoutlinesa0000wred.